Abstract

In  the  continuum,  close  connections  exist  between  mean  curvature  flow,  the  Allen-Cahn 
(AC) partial differential equation, and the Merriman-Bence-Osher (MBO) threshold dynamics
scheme. Graph analogues of these processes have recently seen a rise in popularity
as  relaxations  of  NP-complete  combinatorial  problems,  which  demands  deeper  theoretical 
underpinnings  of  the  graph  processes.  The  aim  of  this  paper  is  to  introduce  these  graph 
processes  in  the  light  of  their  continuum  counterparts,  provide  some  background,  prove  the 
first  results  connecting  them,  illustrate  these  processes  with  examples  and  identify  open 
questions  for  future  study. 

We  derive  a  graph  curvature  from  the  graph  cut  function,  the  natural  graph  counterpart 
of total variation (perimeter). This derivation and the resulting curvature definition differ
from those in earlier literature, where the continuum mean curvature is simply discretized,
and bear many similarities to the continuum nonlocal curvature or nonlocal means formulation.
This new graph curvature is not only relevant for graph MBO dynamics, but also
appears  in  the  variational  formulation  of  a  discrete  time  graph  mean  curvature  flow. 

We prove estimates showing that the dynamics are trivial for both MBO and AC evolutions
if the parameters (the time-step and diffuse interface scale, respectively) are sufficiently
small  (a  phenomenon  known  as  “freezing”  or  “pinning”)  and  also  that  the  dynamics  for  MBO 
are  nontrivial  if  the  time  step  is  large  enough.  These  bounds  are  in  terms  of  graph  quantities 
such  as  the  spectrum  of  the  graph  Laplacian  and  the  graph  curvature.  Adapting  a  Lyapunov 
functional  for  the  continuum  MBO  scheme  to  graphs,  we  prove  that  the  graph  MBO  scheme 
converges  to  a  stationary  state  in  a  finite  number  of  iterations.  Variations  on  this  scheme 
have  recently  become  popular  in  the  literature  as  ways  to  minimize  (continuum)  nonlocal 
total  variation. 

Keywords: spectral graph theory, Allen-Cahn equation, Ginzburg-Landau functional,
Merriman-Bence-Osher threshold dynamics, graph cut function, total variation,
mean curvature flow, nonlocal mean curvature, gamma convergence, graph
coarea formula.
1  Introduction


Motion by mean curvature and flows involving mean curvature in general appear in many important
continuum models, including models coming from materials science [MS64, Tay92],
fluid dynamics [HS98], and combustion [XY11, Pet00]. All such models involve a front propagating
with a velocity depending on the mean curvature of the front. Recently, there has
been an increasing interest in using ideas from continuum PDEs (related to mean curvature)
in discrete applications such as image analysis, machine learning and data clustering
[BF12,  vGB12,  MKB12,  GCMB+13,  HLPB13]. 

This  paper  initiates  a  systematic  study  of  the  definition  of  mean  curvature  for  vertex  sets 
of an arbitrary graph G = (V, E). We examine the effectiveness of the algorithms in the recent
papers mentioned above and how they may be improved. The graphs considered are arbitrary
graphs  and  are  not  necessarily  obtained  as  the  discretization  of  a  continuum  problem,  so  our 
perspective  is  only  parallel  to  one  that  is  purely  motivated  by  numerical  analysis.  In  particular, 
we  do  not  assume  an  embedding  of  the  graph  in  a  low  dimensional  space. 

Of course, the various definitions of curvature in the (usual) continuum setting (see Appendix A
for a brief review and some references) motivate and inform our approach to defining
the curvature of a vertex set $S \subset V$ using the discrete total variation norm and the discrete
divergence of a “normal” edge flow. Since they are closely related to questions of mean curvature,
the Allen-Cahn equation and the MBO scheme for arbitrary graphs G = (V, E) arise
naturally  in  the  present  investigation.  Theoretical  and  numerical  examples  are  used  to  highlight 
possible  connections  between  all  these  concepts,  leading  to  a  number  of  open  questions  given  in 
Section  7. 

Graph Laplacians, Allen-Cahn, and MBO. Graph Laplacians are the central objects
of study in spectral graph theory [Chu97]. These graph operators share many properties with
their continuum counterparts. The Allen-Cahn equation on the graph $G$ is defined in terms
of the graph Laplacian, $\Delta$, and any (typically bistable quartic) potential, $W$. One considers a
phase field, $u: V \times \mathbb{R}_+ \to \mathbb{R}$, solving the differential equation,

$$\dot u = -\Delta u - \frac{1}{\varepsilon} W'(u).$$

This nonlinear equation has received greater attention recently, spurred by some of the applications
mentioned above. The graph Allen-Cahn equation was introduced in the context of
data classification in [BF12] and, in a number of examples, was shown to be both accurate and
efficient. As is well known, the continuum Allen-Cahn equation is the $L^2$ gradient flow associated
to the Ginzburg-Landau functional. This is also true in the graph setting. In [vGB12] it
was shown that the graph Ginzburg-Landau functional $\Gamma$-converges to the graph cut objective
functional on graphs, if the characteristic length scale $\varepsilon$ goes to zero. Moreover, a relationship
between the graph cut functional and the continuum total variation functional was given. At
the same time, the continuum Allen-Cahn solution is known to converge to mean curvature
flow, when $\varepsilon \to 0$ [BSS93]. Furthermore, the mean curvature is directly related to the first variation
of the total variation functional. In this paper we therefore study the graph Allen-Cahn
equation and make connections between it and a graph cut derived ‘graph curvature’.
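
To make the graph Allen-Cahn equation concrete, the following minimal sketch integrates it with forward Euler on a small path graph. The double-well potential $W(x) = x^2(x-1)^2$, the time step `dt`, the value of `eps`, and all function names are assumptions chosen for illustration; they are not taken from the paper. For this small `eps` the thresholded phase does not move, which illustrates the pinning discussed in this paper.

```python
def graph_laplacian(w, u, r=0.0):
    """(Delta u)_i = d_i^{-r} * sum_j w[i][j] * (u_i - u_j)."""
    n = len(w)
    deg = [sum(row) for row in w]
    return [deg[i] ** (-r) * sum(w[i][j] * (u[i] - u[j]) for j in range(n))
            for i in range(n)]

def dW(x):
    """Derivative of the assumed double-well W(x) = x^2 (x - 1)^2."""
    return 2.0 * x * (x - 1.0) * (2.0 * x - 1.0)

def allen_cahn(w, u0, eps, t_end, dt=1e-3, r=0.0):
    """Forward-Euler integration of u' = -Delta u - W'(u) / eps."""
    u = list(u0)
    for _ in range(int(round(t_end / dt))):
        lap = graph_laplacian(w, u, r)
        u = [u[i] - dt * (lap[i] + dW(u[i]) / eps) for i in range(len(u))]
    return u

# Path graph on 4 nodes with unit weights (symmetric weight matrix).
w = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
u = allen_cahn(w, [1.0, 1.0, 0.0, 0.0], eps=0.05, t_end=5.0)
phase = [1 if x >= 0.5 else 0 for x in u]  # threshold the phase field
```

Forward Euler is used only for brevity; its step size must be small relative to both the spectral radius of the Laplacian and $1/\varepsilon$.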

The  third  ingredient  in  this  paper  is  the  threshold  dynamics  or  Merriman-Bence-Osher 
(MBO) algorithm on graphs. Its continuum counterpart was introduced in [MBO92, MBO93]
and consists of iteratively solving the heat equation for a short time, $\tau$, and thresholding the
result to an indicator function. It is known that, for short diffusion times $\tau$, this approximates
mean  curvature  flow  [Eva93,  BG95].  In  this  paper  we  therefore  also  study  the  connections 
between  the  graph  MBO  scheme,  the  graph  AC  equation,  and  graph  curvature. 
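
A single graph MBO iteration can be sketched as follows: diffuse the indicator function for time $\tau$, then threshold. This is an illustrative sketch only; the forward-Euler diffusion, the threshold convention ($u_i \geq 1/2$ maps to 1), and the function names are assumptions rather than the paper's implementation.

```python
def diffuse(w, u0, tau, r=0.0, dt=1e-3):
    """Approximate exp(-tau * Delta) u0 by forward Euler steps of size dt."""
    n = len(w)
    deg = [sum(row) for row in w]
    u = list(u0)
    for _ in range(int(round(tau / dt))):
        u = [u[i] - dt * deg[i] ** (-r)
             * sum(w[i][j] * (u[i] - u[j]) for j in range(n))
             for i in range(n)]
    return u

def mbo_step(w, chi, tau, r=0.0):
    """One MBO iteration: diffusion for time tau followed by thresholding."""
    u = diffuse(w, chi, tau, r)
    return [1 if x >= 0.5 else 0 for x in u]

# Path graph on 4 nodes with unit weights.
w = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
pinned = mbo_step(w, [1, 1, 0, 0], tau=0.01)   # tiny tau: nothing moves
erased = mbo_step(w, [1, 0, 0, 0], tau=20.0)   # large tau: u is nearly constant 1/4
```

With the very small time step the set is stationary, while with the large one the diffused function is close to its mean value $1/4$ and the set disappears.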
In a recent series of papers [ELB08, TEL08, DELT10, DEL11, TEL11, DEL12a, EDL12,
DEL12c, DEL12b] Elmoataz et al. study partial differential equations and front propagation on
graphs, mainly from a numerical point of view. In these papers the 1-Laplacian on a graph is
used as curvature, which differs from our approach. We use the anisotropic graph total variation
instead of the isotropic total variation (see Section 2 for definitions), since [vGB12] suggests
that this is the natural total variation on graphs.

A common obstacle, when transferring results and intuitions from the continuum to graphs,
is the (implicit) lower bound on the accessible length scales on a graph. We show in Theorem 5.3
and Theorem 4.2 that if $\varepsilon$ or $\tau$ is too small, then the Allen-Cahn equation exhibits “freezing”
(or “pinning”), or the MBO scheme is stationary, respectively. Hence the interesting dynamics
happen at small but positive $\varepsilon$ or $\tau$, rather than in the limits as $\varepsilon \to 0$ or $\tau \to 0$. Related is
the lack of a chain rule for derivatives on graphs$^1$, which can be traced back to the absence of
‘infinitesimal’  length  scales  on  a  graph.  As  a  consequence,  the  level  set  approach,  which  has 
proven  very  successful  in  describing  continuum  mean  curvature  flow,  is  not  independent  of  the 
level  set  function  on  a  graph. 

New  results.  The  finite  spectral  radius  of  the  Laplacian  is  used  to  derive  explicit  bounds 
on  the  parameters  for  both  threshold  dynamics  (MBO)  and  the  graph  Allen-Cahn  equation 
that guarantee “freezing” or “pinning” of the initial phase, a phenomenon that has been observed
in numerical simulations and is well known for discretizations of the continuum processes
[MBO92], [MBO94, Section 4.4].

In  the  opposite  direction,  an  argument  based  on  the  comparison  principle  is  used  to  obtain 
a  lower  bound  for  the  MBO  time  step  that  guarantees  that  a  specific  node  of  the  phase  changes 
in  a  single  MBO  iteration.  This  bound  is  given  in  terms  of  a  new  notion  of  mean  curvature  for 
general  graphs,  and  as  such,  it  is  a  “local”  quantity  (as  opposed  to  one  coming  from  spectral 
data).  Such  local  bounds  may  be  of  use  in  developing  adaptive  time  stepping  schemes  that 
complement  the  (spectral)  adaptive  schemes,  such  as  those  developed  for  discretizations  of  the 
continuum  mean  curvature  flow  [Ruu96,  ZD09].  In  this  sense,  introducing  the  graph  mean 
curvature and highlighting its connection with subjects in continuum PDE (MBO, Ginzburg-Landau,
and nonlocal mean curvature) and graph theory (graph cuts, connectivity) are the
main  contributions  of  this  work. 

The  results  in  Sections  4  and  5  and  the  numerical  evidence  and  explicit  examples  in  Section  6 
suggest  several  open  questions  about  the  graph  MBO  scheme,  the  graph  Allen-Cahn  equation, 
and  graph  mean  curvature,  which  are  discussed  in  Section  7.  These  are  interesting  questions 
for  future  work. 

Outline.  In  Section  2  the  relevant  graph  based  calculus  is  introduced,  setting  the  notation  for 
the  rest  of  the  paper.  In  particular,  the  graph  Laplacian  and  its  basic  properties  are  discussed. 
Section  3  discusses  curvature  and  mean  curvature  flow  on  a  graph.  Sections  4  and  5  discuss 
the  MBO  scheme  and  Allen-Cahn  equation  on  graphs,  respectively,  and  sufficient  conditions 
are  given  on  the  parameters  to  guarantee  freezing  or  pinning  of  the  initial  conditions.  Section  6 
explores  the  graph  processes  and  concepts  introduced  in  these  previous  three  sections  through 
theoretical  and  computational  examples.  Finally,  we  conclude  in  Section  7,  with  a  discussion 
and  a  few  open  questions  based  on  the  new  estimates  and  examples  from  previous  sections.  In 
Appendix  A,  we  make  some  remarks  on  the  continuum  mean  curvature  flow.  In  Appendix  B, 
we  derive  the  graph  co-area  formula.  Appendix  C  discusses  some  similarities  between  the  graph 
Laplacian,  the  graph  1-Laplacian,  and  the  graph  curvature. 

1 That is, if $u \in \mathcal{V}$ and $f: \mathcal{V} \to \mathcal{V}$, then $\nabla f(u) \neq L_f \nabla u$ for any linear operator $L_f$.
2  Setup

We work on a finite$^2$, undirected graph $G = (V,E)$ with vertex set$^3$ $V = \{i\}_{i=1}^n$ and edge set
$E \subset V^2$. The graph is weighted; each edge $(i,j) \in E$, incident on nodes $i$ and $j$, is assigned a
weight $\omega_{ij} > 0$. Since the graph is undirected, $(i,j)$ is identified with $(j,i)$ and $\omega_{ij} = \omega_{ji}$. To
simplify notation we extend $\omega_{ij}$ to be zero for all $(i,j) \in V^2$ which do not correspond to an
edge. The degree of node $i$ is $d_i := \sum_{j \in V} \omega_{ij}$. Denote the maximal and minimal degrees by
$d_+ := \max_{i \in V} d_i$ and $d_- := \min_{i \in V} d_i$. We assume that $G$ has no isolated nodes, i.e., $d_- > 0$.
For each $i \in V$ we then have a non-empty set of neighbors $\mathcal{N}_i := \{j \in V : \omega_{ij} > 0\}$. We also
assume $G$ has no self-loops, i.e., $\omega_{ii} = 0$. In particular $i \notin \mathcal{N}_i$.

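Let $\mathcal{V}$ be the space of all functions $V \to \mathbb{R}$ and $\mathcal{E}$ the space of all skew-symmetric$^4$ functions
$E \to \mathbb{R}$. Again to simplify notation, we extend each $\varphi \in \mathcal{E}$ to a function $\varphi: V^2 \to \mathbb{R}$ by setting
$\varphi_{ij} = 0$ if $(i,j) \notin E$. As justified in earlier work [vGB12] we introduce the following inner
products and operators for parameters $q \in [1/2, 1]$ and $r \in [0, 1]$:

$$\langle u, v \rangle_{\mathcal{V}} := \sum_{i \in V} u_i v_i d_i^r, \qquad \langle \varphi, \phi \rangle_{\mathcal{E}} := \frac12 \sum_{i,j \in V} \varphi_{ij} \phi_{ij} \omega_{ij}^{2q-1}, \qquad (\varphi \cdot \phi)_i := \frac12 \sum_{j \in V} \varphi_{ij} \phi_{ij} \omega_{ij}^{2q-1},$$

$$(\nabla u)_{ij} := \omega_{ij}^{1-q} (u_j - u_i), \qquad (\mathrm{div}\, \varphi)_i := \frac{1}{2 d_i^r} \sum_{j \in V} \omega_{ij}^q (\varphi_{ji} - \varphi_{ij}).$$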

Note that in the sum in $\langle \varphi, \phi \rangle_{\mathcal{E}}$ the indices $i$ and $j$ both run over all nodes. The edges $(i,j)$ and
$(j,i)$ are counted separately, hence the correction factor $\frac12$. Note that the powers $2q - 1$ and
$1 - q$ in the $\mathcal{E}$ inner product and ‘dot product’ and in the gradient, are zero for the admissible
choices $q = \frac12$ and $q = 1$ respectively. In these cases we define $\omega_{ij}^0 = 0$ whenever $\omega_{ij} = 0$, so as
not to make the inner product, ‘dot product’, or gradient, nonlocal on the graph. The inner
products on $\mathcal{V}$ and $\mathcal{E}$$^5$ are analogous to weighted $L^2$ inner products in the continuum case,
while the ‘dot product’ $(\varphi \cdot \phi)_i$ is analogous to a weighted dot product between
vector (field)s (without the integration of the $L^2$ inner product). A direct computation shows
that div and $\nabla$ are adjoints with respect to $\langle \cdot, \cdot \rangle_{\mathcal{V}}$ and $\langle \cdot, \cdot \rangle_{\mathcal{E}}$, namely for $u \in \mathcal{V}$ and $\phi \in \mathcal{E}$ we
have

$$\langle \nabla u, \phi \rangle_{\mathcal{E}} = \langle u, \mathrm{div}\, \phi \rangle_{\mathcal{V}}.$$
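
The adjoint relation between the gradient and the divergence can be checked numerically. The sketch below is illustrative only: the weight matrix, the test vectors, and the function names are assumptions, and the operators follow the definitions $(\nabla u)_{ij} = \omega_{ij}^{1-q}(u_j - u_i)$ and $(\mathrm{div}\,\varphi)_i = \frac{1}{2d_i^r}\sum_j \omega_{ij}^q(\varphi_{ji} - \varphi_{ij})$.

```python
def grad(w, u, q):
    """(grad u)_ij = w_ij^(1-q) * (u_j - u_i) on edges, 0 elsewhere."""
    n = len(w)
    return [[w[i][j] ** (1 - q) * (u[j] - u[i]) if w[i][j] > 0 else 0.0
             for j in range(n)] for i in range(n)]

def div(w, phi, q, r):
    """(div phi)_i = (2 d_i^r)^(-1) * sum_j w_ij^q * (phi_ji - phi_ij)."""
    n = len(w)
    deg = [sum(row) for row in w]
    return [sum(w[i][j] ** q * (phi[j][i] - phi[i][j])
                for j in range(n) if w[i][j] > 0) / (2 * deg[i] ** r)
            for i in range(n)]

def inner_V(w, u, v, r):
    """<u, v>_V = sum_i u_i v_i d_i^r."""
    deg = [sum(row) for row in w]
    return sum(u[i] * v[i] * deg[i] ** r for i in range(len(u)))

def inner_E(w, phi, psi, q):
    """<phi, psi>_E = (1/2) sum_ij phi_ij psi_ij w_ij^(2q-1), edges only."""
    n = len(w)
    return 0.5 * sum(phi[i][j] * psi[i][j] * w[i][j] ** (2 * q - 1)
                     for i in range(n) for j in range(n) if w[i][j] > 0)

# Weighted triangle, a node function u, and a skew-symmetric edge function phi.
w = [[0, 2, 1], [2, 0, 3], [1, 3, 0]]
u = [1.0, -2.0, 0.5]
phi = [[0.0, 1.0, -2.0], [-1.0, 0.0, 0.5], [2.0, -0.5, 0.0]]
q, r = 0.75, 0.5
lhs = inner_E(w, grad(w, u, q), phi, q)   # <grad u, phi>_E
rhs = inner_V(w, u, div(w, phi, q, r), r)  # <u, div phi>_V
```

Restricting the sums to pairs with `w[i][j] > 0` implements the convention that the operators stay local on the graph even when an exponent is zero.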

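The characteristic function of a node set $S \subset V$ is $\chi_S \in \mathcal{V}$, defined via

$$(\chi_S)_i := \begin{cases} 1 & \text{if } i \in S, \\ 0 & \text{if } i \notin S. \end{cases}$$

This leads to the following associated norms, Laplacians, set measures, and total variation
functionals:

•  Inner product norms, $\|u\|_{\mathcal{V}} := \sqrt{\langle u, u \rangle_{\mathcal{V}}}$ and $\|\varphi\|_{\mathcal{E}} := \sqrt{\langle \varphi, \varphi \rangle_{\mathcal{E}}}$.

•  Maximum norms$^6$, $\|u\|_{\mathcal{V},\infty} := \max\{|u_i| : i \in V\}$ and $\|\varphi\|_{\mathcal{E},\infty} := \max\{|\varphi_{ij}| : i,j \in V\}$.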

2 In this paper, we are working with a fixed graph $G$ with a finite number of nodes. In no sense are we
considering a sequence of graphs or taking a “continuum limit”.

3 We will use the terms “vertex” and “node” interchangeably.

4 The  necessity  of  skew-symmetry  may  not  be  obvious  at  this  point,  but  it  is  a  common  requirement  for 
consistency  of  certain  concepts  in  discrete  calculus,  see  e.g.,  [FT04,  GO09,  GP10,  CMOP11]. 

5 Note that $\langle \cdot, \cdot \rangle_{\mathcal{E}}$ is indeed an inner product on the space of (skew-symmetric) functions $E \to \mathbb{R}$, but not
for the space of functions $V^2 \to \mathbb{R}$, because for those functions the ‘inner product’ can be zero for nontrivial
functions.

6 To justify these definitions and convince ourselves that there should be no $\omega_{ij}$ or $d_i$ included in the maximum
norms we define $\|\varphi\|_{\mathcal{E},p}^p := \frac12 \sum_{i,j \in V} \omega_{ij}^{2q-1} |\varphi_{ij}|^p$. Adapting the proofs in the continuum case, e.g., [Ada75, Theorems
•  The norm corresponding to the dot product, $|\varphi|_i := \sqrt{(\varphi \cdot \varphi)_i}$. Note that $|\varphi| \in \mathcal{V}$.

•  The Dirichlet energy

$$\frac12 \|\nabla u\|_{\mathcal{E}}^2 = \frac14 \sum_{i,j \in V} \omega_{ij} (u_i - u_j)^2.$$

•  The graph Laplacian $\Delta := \mathrm{div} \circ \nabla : \mathcal{V} \to \mathcal{V}$. So

$$(\Delta u)_i := d_i^{-r} \sum_{j \in V} \omega_{ij} (u_i - u_j).$$

It is worth noting that the sign convention for the graph Laplacian is opposite to that
used for the continuum Laplacian in most of the PDE literature (in particular, the graph
Laplacian $\Delta$ is a positive semidefinite operator).

When $r = 0$, $\Delta$ is referred to as the unnormalized weighted graph Laplacian. When
$r = 1$, $\Delta$ is referred to as the asymmetric normalized graph Laplacian or random walk
graph Laplacian. Another Laplacian, often encountered in the spectral graph theory
literature, is the symmetric normalized graph Laplacian. This one falls outside the scope
of the current setup and will not be considered in this paper. For general references on
the graph Laplacian, consult [Moh91, Chu97, vL07, BLS07].

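•  For $S \subset V$, the set measures

$$\mathrm{vol}\, S := \sum_{i \in S} d_i^r = \|\chi_S\|_{\mathcal{V}}^2,$$

$$|S| := \text{number of vertices in } S.$$

Note that $|S|$ is just a special case of $\mathrm{vol}\, S$, for $r = 0$ (recall we assume $d_- > 0$).

•  The isotropic and anisotropic total variation $\mathrm{TV}: \mathcal{V} \to \mathbb{R}$ and $\mathrm{TV}_a^q: \mathcal{V} \to \mathbb{R}$ respectively:

$$\mathrm{TV}(u) := \max\Big\{ \langle \mathrm{div}\, \varphi, u \rangle_{\mathcal{V}} : \varphi \in \mathcal{E},\ \max_{i \in V} |\varphi|_i \leq 1 \Big\} = \sum_{i \in V} |\nabla u|_i = \frac{\sqrt{2}}{2} \sum_{i \in V} \sqrt{\sum_{j \in V} \omega_{ij} (u_i - u_j)^2},$$

$$\mathrm{TV}_a^q(u) := \max\big\{ \langle \mathrm{div}\, \varphi, u \rangle_{\mathcal{V}} : \varphi \in \mathcal{E},\ \|\varphi\|_{\mathcal{E},\infty} \leq 1 \big\} = \langle \nabla u, \mathrm{sgn}(\nabla u) \rangle_{\mathcal{E}} = \frac12 \sum_{i,j \in V} \omega_{ij}^q |u_i - u_j|.$$

Here, the signum function is understood to act element-wise on the elements of $\nabla u$. We
note that the maximum in the definition of TV is achieved by taking

$$\varphi_{ij} = \varphi_{ij}^{TV} := \begin{cases} \dfrac{(\nabla u)_{ij}}{|\nabla u|_i} & \text{if } |\nabla u|_i \neq 0, \\ 0 & \text{if } |\nabla u|_i = 0. \end{cases}$$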


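2.3 and 2.8], to the graph situation we can prove a Hölder inequality $\|\varphi \phi\|_{\mathcal{E},1} \leq \|\varphi\|_{\mathcal{E},p} \|\phi\|_{\mathcal{E},p'}$ for $1 < p, p' < \infty$
such that $\frac1p + \frac1{p'} = 1$, an embedding theorem of the form $\|\varphi\|_{\mathcal{E},p} \leq \Big( \frac12 \sum_{i,j \in V} \omega_{ij}^{2q-1} \Big)^{\frac1p - \frac1s} \|\varphi\|_{\mathcal{E},s}$ for $1 \leq p \leq s \leq \infty$,
and the limit $\lim_{p \to \infty} \|\varphi\|_{\mathcal{E},p} = \|\varphi\|_{\mathcal{E},\infty}$. A similar result holds for the norms on $\mathcal{V}$.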
The maximum in the definition of $\mathrm{TV}_a^q$ is achieved by

$$\varphi = \varphi^a := \mathrm{sgn}(\nabla u). \qquad (1)$$

Note that the values $\varphi^{TV}$ and $\varphi^a$ take on the set $\{\nabla u = 0\}$ are irrelevant for achieving
the maximum, hence these functions are not uniquely determined. The quantity $\mathrm{div}\, \varphi^{TV}$
is often referred to as the 1-Laplacian of $u$.

The anisotropic total variation of the indicator function$^7$ for the set $S \subset V$, denoted $\chi_S$, is
given by

$$\mathrm{TV}_a^q(\chi_S) = \sum_{i \in S,\, j \in S^c} \omega_{ij}^q. \qquad (2)$$

Thus, the total variation of a set $S$ is equivalent to the graph cut between $S$ and $S^c := V \setminus S$
which is used in graph theory and spectral clustering [SM00]. For future reference it is useful
to note that

$$\mathrm{TV}_a^1(\chi_S) = \langle \nabla \chi_S, \nabla \chi_S \rangle_{\mathcal{E}} = \langle \chi_S, \Delta \chi_S \rangle_{\mathcal{V}}. \qquad (3)$$
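
As a small worked example of (2) and (3), the sketch below evaluates the anisotropic total variation of an indicator function and compares it with the graph cut and with the quadratic form of the Laplacian. The weight matrix, the node set, and the function names are assumptions chosen for illustration.

```python
def tv_a(w, u, q):
    """Anisotropic total variation: (1/2) sum_ij w_ij^q |u_i - u_j|."""
    n = len(w)
    return 0.5 * sum(w[i][j] ** q * abs(u[i] - u[j])
                     for i in range(n) for j in range(n) if w[i][j] > 0)

def laplacian(w, u, r):
    """(Delta u)_i = d_i^{-r} sum_j w_ij (u_i - u_j)."""
    n = len(w)
    deg = [sum(row) for row in w]
    return [deg[i] ** (-r) * sum(w[i][j] * (u[i] - u[j]) for j in range(n))
            for i in range(n)]

def inner_V(w, u, v, r):
    """<u, v>_V = sum_i u_i v_i d_i^r."""
    deg = [sum(row) for row in w]
    return sum(u[i] * v[i] * deg[i] ** r for i in range(len(u)))

# Weighted graph on 4 nodes and the node set S = {0, 1}.
w = [[0, 2, 1, 0], [2, 0, 3, 0], [1, 3, 0, 4], [0, 0, 4, 0]]
chi = [1.0, 1.0, 0.0, 0.0]

cut = w[0][2] + w[0][3] + w[1][2] + w[1][3]   # edges leaving S: 1 + 0 + 3 + 0
tv1 = tv_a(w, chi, q=1.0)                      # equation (2) with q = 1
tv_half = tv_a(w, chi, q=0.5)                  # equation (2) with q = 1/2
quad = inner_V(w, chi, laplacian(w, chi, r=0.5), r=0.5)  # right side of (3)
```

The quadratic form in (3) does not depend on $r$, since the factor $d_i^r$ in the inner product cancels the factor $d_i^{-r}$ in the Laplacian.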

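Lemma 2.1. The norms $\|\cdot\|_{\mathcal{V}}$ and $\|\cdot\|_{\mathcal{V},\infty}$ are equivalent, with optimal constants given by

$$d_-^{r/2} \|u\|_{\mathcal{V},\infty} \leq \|u\|_{\mathcal{V}} \leq \sqrt{\mathrm{vol}\, V}\, \|u\|_{\mathcal{V},\infty}.$$

Proof. We compute $\|u\|_{\mathcal{V}}^2 = \sum_{i \in V} d_i^r u_i^2 \leq \max_{i \in V} u_i^2 \sum_{i \in V} d_i^r = (\mathrm{vol}\, V) \|u\|_{\mathcal{V},\infty}^2$, which is
saturated if $u = \chi_V$. Also, $\|u\|_{\mathcal{V}}^2 = \sum_{i \in V} d_i^r u_i^2 \geq d_-^r \max_{i \in V} u_i^2 = d_-^r \|u\|_{\mathcal{V},\infty}^2$. If $j \in V$ is such
that $d_j = d_-$, this bound is attained for $u = \chi_{\{j\}}$. □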

Next  we  recall  the  definitions  of  node  set  boundaries  and  (signed)  graph  distance. 

Definition 2.2. For $j \in \mathcal{N}_i$, we define $d^G_{ij} := \omega_{ij}^{q-1}$, and we set $d^G_{ii} := 0$. A path on $V$ is a
sequence $\gamma = \{i_1, i_2, \ldots, i_m\}$ for some $m \in \mathbb{N}$ such that $i_{k+1} \in \mathcal{N}_{i_k}$ for each $k \in \{1, \ldots, m-1\}$.
Given a path $\gamma = \{i_1, \ldots, i_m\}$, its length is defined as

$$|\gamma| := \sum_{k=1}^{m-1} d^G_{i_k i_{k+1}}.$$

Then, the graph distance between arbitrary $i, j \in V$ is given by

$$d^G_{ij} := \min |\gamma|,$$

where the minimum is taken over all paths $\gamma$ with $i_1 = i$, $i_m = j$. In other words, $d^G_{ij}$ is the
minimal distance to go from node $i$ to node $j$, traveling only via existing edges, where each edge
represents a distance $\omega_{ij}^{q-1}$. For a given set $S \subset V$, we define the graph distance to $S$ at each
node as the minimal graph distance to a node in $S$:

$$d^S_i := \min_{j \in S} d^G_{ij}.$$

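As argued in, for example, [MOS12, Section 3.1, Example 2], $d^S$ is the solution $u \in \mathcal{V}$ to the
equation

$$\begin{cases} \min_{j \in \mathcal{N}_i} (\nabla u)_{ij} + 1 = 0 & \text{if } i \in V \setminus S, \\ u_i = 0 & \text{if } i \in S. \end{cases} \qquad (4)$$

7 For $\chi_V$, the indicator function of the full node set, we also write the constant function 1.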
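
The graph distance to a set can be computed with Dijkstra's algorithm, using edge lengths $\omega_{ij}^{q-1}$ as in Definition 2.2. The sketch below is an illustration under assumed inputs (the weights, the set $S$, and the function names are hypothetical); it also evaluates the residual of equation (4) at the nodes outside $S$.

```python
import heapq

def distance_to_set(w, S, q):
    """Multi-source Dijkstra with edge lengths w_ij^(q-1); returns d^S."""
    n = len(w)
    dist = [float("inf")] * n
    heap = []
    for s in S:
        dist[s] = 0.0
        heapq.heappush(heap, (0.0, s))
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue  # stale heap entry
        for j in range(n):
            if w[i][j] > 0:
                nd = d + w[i][j] ** (q - 1)
                if nd < dist[j]:
                    dist[j] = nd
                    heapq.heappush(heap, (nd, j))
    return dist

def eikonal_residual(w, u, i, q):
    """min_{j in N_i} w_ij^(1-q) (u_j - u_i) + 1, the left side of (4)."""
    n = len(w)
    return min(w[i][j] ** (1 - q) * (u[j] - u[i])
               for j in range(n) if w[i][j] > 0) + 1.0

# Path 0 - 1 - 2 with weights 4 and 1; for q = 1/2 the edge lengths are 1/2 and 1.
w = [[0, 4, 0], [4, 0, 1], [0, 1, 0]]
dist = distance_to_set(w, S={0}, q=0.5)
residuals = [eikonal_residual(w, dist, i, q=0.5) for i in (1, 2)]
```

For $q = 1$ all edge lengths equal 1, so the same routine returns the hop distance to $S$.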
Definition 2.3. The boundary of $S \subset V$ is$^8$

$$\partial S := \{ i \in S : \exists\, j \in V \text{ s.t. } (\nabla \chi_S)_{ij} < 0 \}.$$

Note that $\partial S \subset S$. Alternative definitions appear in the literature in which $\partial S \subset S^c$.
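
Since $(\nabla \chi_S)_{ij} = \omega_{ij}^{1-q}\big((\chi_S)_j - (\chi_S)_i\big)$ is negative exactly when $i \in S$, $j \notin S$ and $\omega_{ij} > 0$, the boundary consists of the nodes of $S$ that have a neighbor outside $S$. A minimal sketch, with a hypothetical helper name and example graph:

```python
def boundary(w, S):
    """Nodes of S with at least one neighbor outside S."""
    n = len(w)
    return {i for i in S if any(w[i][j] > 0 and j not in S for j in range(n))}

# Path graph on 4 nodes with unit weights.
w = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
S = {0, 1}
inner = boundary(w, S)                      # boundary of S, a subset of S
outer = boundary(w, set(range(4)) - S)      # boundary of the complement
```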


2.1  Basic spectral properties of the graph Laplacian, $\Delta$


In this section, we collect a number of spectral properties of the graph Laplacian $\Delta: \mathcal{V} \to \mathcal{V}$.
Further discussion and details for the special cases $r = 0$ and $r = 1$ can be found in
[Moh91, Chu97, vL07, BLS07], which our presentation follows.

Note that $\Delta: \mathcal{V} \to \mathcal{V}$ is self-adjoint with respect to the $\mathcal{V}$ inner product. For $u \in \mathcal{V} \setminus \{0\}$, the Rayleigh
quotient $R$ is defined as

$$R(u) := \frac{\langle u, \Delta u \rangle_{\mathcal{V}}}{\|u\|_{\mathcal{V}}^2} = \frac{\|\nabla u\|_{\mathcal{E}}^2}{\|u\|_{\mathcal{V}}^2}.$$

The eigenvalues of the graph Laplacian, $\Delta$, are then defined via the variational formulation,

$$\lambda_k = \min_{\substack{F_k \subset \mathcal{V} \\ \text{subspace of dim } k}} \ \max_{u \in F_k \setminus \{0\}} R(u). \qquad (5)$$

The minimum in (5) is attained when $F_k$ is spanned by the first $k$ eigenfunctions, i.e., the
eigenfunctions corresponding to the $k$ smallest eigenvalues, counting multiplicities. In particular,
there are $n$ non-negative real eigenvalues (counted with multiplicity), denoted $\{\lambda_k\}_{k=1}^n$. If we
denote the span of the first $k-1$ eigenfunctions by $F_{k-1} = \mathrm{span}(\{u_i\}_{i=1}^{k-1})$, then (5) can be
rewritten

$$\lambda_k = \min_{\substack{u \in \mathcal{V} \setminus \{0\} \\ u \perp_{\mathcal{V}} F_{k-1}}} R(u), \qquad (6)$$

where $u \perp_{\mathcal{V}} F_{k-1}$ indicates that $u$ is orthogonal (in the sense of $\langle \cdot, \cdot \rangle_{\mathcal{V}}$) to $u_i$, for $i \in \{1, \ldots, k-1\}$.
Taking variations of the Rayleigh quotient with respect to $u$, we find that $(\lambda_k, u_k)$ satisfies (6)
if and only if $u_k \perp_{\mathcal{V}} F_{k-1}$ and, for all $v \in \mathcal{V}$,

$$\langle \Delta u_k, v \rangle_{\mathcal{V}} = \lambda_k \langle u_k, v \rangle_{\mathcal{V}}. \qquad (7)$$

Finally, unwinding the definitions, we find that (7) is equivalent to the matrix eigenvalue problem

$$L x = \lambda x \quad \text{where } L = D^{-r}(D - A),\ x \in \mathbb{R}^n, \qquad (8)$$

where $A_{ij} = \omega_{ij}$ and $D$ is the diagonal matrix with $D_{ii} = d_i$$^9$. Recall that the spectral radius $\rho$ of $\Delta$ is
defined as the maximum of the absolute values of the eigenvalues of $\Delta$,

$$\rho(\Delta) := \max_{1 \leq i \leq n} |\lambda_i| = \lambda_n = \sup_{u \in \mathcal{V} \setminus \{0\}} R(u).$$

8 Similarly, by changing the “strictly less than” inequalities into “strictly larger than” inequalities the boundary
$\partial(S^c)$ of the set $S^c$ is defined. The reduced boundary of $S$ can be defined as the following subset of $\partial S$ (compare
with the continuum case in [AFP00, Definition 3.54]):

$$\partial^* S := \{ i \in S : \exists!\, j \in V : (\nabla \chi_S)_{ij} < 0 \},$$

and again similarly for $\partial^* S^c$.

9 Note that here we use the fact that there are no isolated vertices, i.e., $d_i > 0$ for all $i \in V$.
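Lemma 2.4 (Spectral properties of the graph Laplacian, $\Delta$). The following properties are
satisfied:

(a) The smallest eigenvalue is $\lambda_1 = 0$. The multiplicity of $\lambda_1 = 0$ is the number of connected
components of the graph and the associated eigenspace is spanned by the set of indicator vectors
for each connected component. If there is only one connected component, $\lambda_1$ is simple and
the first (unnormalized) eigenfunction is $u_1 = 1 = \chi_V$.

(b) The operator norm of $\Delta$, $\|\Delta\|_{\mathcal{V}} := \sup_{u \neq 0} \frac{\|\Delta u\|_{\mathcal{V}}}{\|u\|_{\mathcal{V}}}$, and the spectral radius are equal, $\|\Delta\|_{\mathcal{V}} = \rho(\Delta)$.
This implies that, for all $u \in \mathcal{V}$,

$$\|\Delta u\|_{\mathcal{V}} \leq \rho(\Delta) \|u\|_{\mathcal{V}}.$$

(c) The trace satisfies $\mathrm{tr}\, \Delta = \sum_{k=1}^n \lambda_k = \sum_{i \in V} d_i^{1-r}$. Consequently,

$$\lambda_2 \leq \frac{1}{n-1} \sum_{i \in V} d_i^{1-r} \leq \frac{n\, d_+^{1-r}}{n-1} \quad \text{and} \quad \lambda_n \geq \frac{1}{n-1} \sum_{i \in V} d_i^{1-r} \geq \frac{n\, d_-^{1-r}}{n-1}. \qquad (9)$$

(d) If $G$ is not a complete graph then

$$\lambda_2 \leq \min_{(a,b) \notin E} \frac{d_a d_b^{2r} + d_a^{2r} d_b}{d_a^r d_b^{2r} + d_a^{2r} d_b^r}.$$

(e) The second eigenvalue satisfies

$$\lambda_2 \leq \min_{S \subset V} \frac{(\mathrm{vol}\, V)\, \mathrm{TV}_a^1(\chi_S)}{(\mathrm{vol}\, S)(\mathrm{vol}\, S^c)} \leq 2 \min_{S \subset V} \frac{\mathrm{TV}_a^1(\chi_S)}{\min(\mathrm{vol}\, S, \mathrm{vol}\, S^c)}.$$

(f) The spectral radius of $\Delta$ satisfies $\rho \leq 2 d_+^{1-r}$.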


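Proof. (a) These properties follow directly from (6).

(b) Noting that $\Delta$ is a self-adjoint operator, a proof can be found in, for example, [RS79,
Thm. VI.6].

(c) Because the trace of the operator $\Delta$ is equal to the trace of its matrix representation,
we have $\mathrm{tr}\, \Delta = \mathrm{tr}\, L = \sum_{k=1}^n \lambda_k$. Since we assume there are no self-loops in the graph,
$\mathrm{tr}\, D^{-r} A = 0$, hence $\mathrm{tr}\, L = \mathrm{tr}\, D^{1-r}$. Equation (9) follows from the fact that $\lambda_1 = 0$ and the
maximum (minimum) of a set is greater (less) than or equal to the mean of the set.

(d) If $G$ is not a complete graph, then there exists an $(a,b) \notin E$. We define the test function
$v \in \mathcal{V}$,

$$v_i = \begin{cases} d_b^r & i = a, \\ -d_a^r & i = b, \\ 0 & \text{otherwise}. \end{cases}$$

Note that $\langle v, 1 \rangle_{\mathcal{V}} = 0$. The desired upper bound then follows from (6).

(e) For $S \subset V$, define the test function $v \in \mathcal{V}$,

$$v_i = \begin{cases} \mathrm{vol}\, S^c & i \in S, \\ -\mathrm{vol}\, S & i \in S^c. \end{cases}$$

Then $\langle v, 1 \rangle_{\mathcal{V}} = 0$ and $\|v\|_{\mathcal{V}}^2 = (\mathrm{vol}\, S^c)(\mathrm{vol}\, S)\, \mathrm{vol}\, V$. Using (2), we compute $\|\nabla v\|_{\mathcal{E}}^2 = (\mathrm{vol}\, V)^2\, \mathrm{TV}_a^1(\chi_S)$.
The first inequality then follows from (6). For the second inequality,

$$\frac{(\mathrm{vol}\, V)\, \mathrm{TV}_a^1(\chi_S)}{(\mathrm{vol}\, S)(\mathrm{vol}\, S^c)} = \frac{\mathrm{TV}_a^1(\chi_S)}{\mathrm{vol}\, S} + \frac{\mathrm{TV}_a^1(\chi_S)}{\mathrm{vol}\, S^c} \leq \frac{2\, \mathrm{TV}_a^1(\chi_S)}{\min(\mathrm{vol}\, S, \mathrm{vol}\, S^c)}.$$

(f) Using the inequality $(a - b)^2 \leq 2(a^2 + b^2)$, we compute

$$\rho(\Delta) = \sup_{u \in \mathcal{V} \setminus \{0\}} R(u) = \sup_{u \in \mathcal{V} \setminus \{0\}} \frac{\frac12 \sum_{i,j} \omega_{ij} (u_i - u_j)^2}{\sum_i d_i^r u_i^2} \leq \sup_{u \in \mathcal{V} \setminus \{0\}} \frac{\sum_{i,j} \omega_{ij} (u_i^2 + u_j^2)}{\sum_i d_i^r u_i^2} = \sup_{u \in \mathcal{V} \setminus \{0\}} \frac{2 \sum_i d_i u_i^2}{\sum_i d_i^r u_i^2}.$$

If $j \in V$ is such that $d_j = d_+$, then the supremum is attained by the vector $u = \chi_{\{j\}}$ and the
result follows. □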
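
The bounds on the largest eigenvalue can be checked numerically on a small example. The sketch below estimates the spectral radius by power iteration and compares it with the lower bound in (9) and the upper bound $2 d_+^{1-r}$ of part (f). The example graph, the starting vector, and the function names are assumptions; power iteration is used here only because it needs no external libraries, and it presumes the starting vector has a component along the top eigenvector.

```python
import math

def laplacian(w, u, r):
    """(Delta u)_i = d_i^{-r} sum_j w_ij (u_i - u_j)."""
    n = len(w)
    deg = [sum(row) for row in w]
    return [deg[i] ** (-r) * sum(w[i][j] * (u[i] - u[j]) for j in range(n))
            for i in range(n)]

def inner_V(w, u, v, r):
    """<u, v>_V = sum_i u_i v_i d_i^r."""
    deg = [sum(row) for row in w]
    return sum(u[i] * v[i] * deg[i] ** r for i in range(len(u)))

def spectral_radius(w, r, iters=500):
    """Power iteration followed by a Rayleigh quotient in the V inner product."""
    u = [1.0] + [0.0] * (len(w) - 1)
    for _ in range(iters):
        v = laplacian(w, u, r)
        norm = math.sqrt(inner_V(w, v, v, r))
        u = [x / norm for x in v]
    return inner_V(w, u, laplacian(w, u, r), r) / inner_V(w, u, u, r)

# Path graph on 3 nodes with unit weights; for r = 0 the eigenvalues are 0, 1, 3.
w = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
r = 0.0
n = len(w)
deg = [sum(row) for row in w]
rho = spectral_radius(w, r)
upper = 2 * max(deg) ** (1 - r)                      # bound from part (f)
lower = sum(d ** (1 - r) for d in deg) / (n - 1)     # bound from (9)
```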

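The following lemma states properties of the diffusion operator $e^{-t\Delta}: \mathcal{V} \to \mathcal{V}$.

Lemma 2.5 (Diffusion on a graph). Let $u(t) := e^{-t\Delta} u_0$ for $t \geq 0$ denote the evolution of $u_0 \in \mathcal{V}$
by the diffusion operator. The following properties hold.

(a) The mass,

$$M(u) := \langle u, \chi_V \rangle_{\mathcal{V}} = \sum_{i \in V} u_i d_i^r, \qquad (10)$$

is conserved, i.e., $M(u(t)) = M(u_0)$ for all $t \geq 0$.

(b) $\frac{d}{dt} \|u\|_{\mathcal{V}}^2 = -2 \|\nabla u\|_{\mathcal{E}}^2 \leq 0$. In particular, $\|e^{-t\Delta} u_0\|_{\mathcal{V}} \leq \|u_0\|_{\mathcal{V}}$.

(c) Let the mass, $M$, be defined as in (10), $\lambda_2$ be the second eigenvalue of the graph Laplacian,
and $\varepsilon > 0$. Assume the graph is connected. If $\tau > \frac{1}{\lambda_2} \log\left( \varepsilon^{-1} d_-^{-r/2} \|u_0 - (\mathrm{vol}\, V)^{-1} M\|_{\mathcal{V}} \right)$,
then

$$\|u(t) - (\mathrm{vol}\, V)^{-1} M\|_{\mathcal{V},\infty} \leq \varepsilon, \quad \text{for all } t > \tau.$$

(d) (Comparison Principle) If, for all $j \in V$, $(u_0)_j \leq (v_0)_j$, then $(e^{-t\Delta} u_0)_j \leq (e^{-t\Delta} v_0)_j$, for
all $j \in V$ and $t \geq 0$. In particular, $\|e^{-t\Delta} u_0\|_{\mathcal{V},\infty} \leq \|u_0\|_{\mathcal{V},\infty}$.

If $G$ is connected, the strong comparison principle holds: If, for all $j \in V$, $(u_0)_j \leq (v_0)_j$,
and for some $j_0 \in V$, $(u_0)_{j_0} < (v_0)_{j_0}$, then, for all $k \in V$ and $t > 0$, $(e^{-t\Delta} u_0)_k < (e^{-t\Delta} v_0)_k$.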

Proof. (a) We compute $\frac{d}{dt} M(u) = \langle \dot u, \chi_V \rangle_{\mathcal{V}} = -\langle \Delta u, \chi_V \rangle_{\mathcal{V}} = -\langle \nabla u, \nabla \chi_V \rangle_{\mathcal{E}} = 0$.

(b) We compute $\frac12 \frac{d}{dt} \|u\|_{\mathcal{V}}^2 = \langle u, \dot u \rangle_{\mathcal{V}} = -\langle u, \Delta u \rangle_{\mathcal{V}} = -\langle \nabla u, \nabla u \rangle_{\mathcal{E}} = -\|\nabla u\|_{\mathcal{E}}^2$.

(c) If $\{(\lambda_j, v_j)\}_{j=1}^n$ denote the eigenpairs of the graph Laplacian with $\mathcal{V}$-normalized eigenvectors,
then the spectral decomposition of $u$ is given by

$$u(t) = \sum_{j=1}^n e^{-\lambda_j t} \langle u_0, v_j \rangle_{\mathcal{V}}\, v_j. \qquad (11)$$

Recalling from Lemma 2.4 that $\lambda_1 = 0$ and $v_1 = (\mathrm{vol}\, V)^{-1/2} \chi_V$ and using (11), we compute

$$\|u - (\mathrm{vol}\, V)^{-1} M\|_{\mathcal{V}} = \Big\| \sum_{j > 1} e^{-\lambda_j t} \langle u_0, v_j \rangle_{\mathcal{V}}\, v_j \Big\|_{\mathcal{V}} \leq e^{-\lambda_2 t} \|u_0 - (\mathrm{vol}\, V)^{-1} M\|_{\mathcal{V}}.$$

But by Lemma 2.1, this implies

$$\|u - (\mathrm{vol}\, V)^{-1} M\|_{\mathcal{V},\infty} \leq d_-^{-r/2} e^{-\lambda_2 t} \|u_0 - (\mathrm{vol}\, V)^{-1} M\|_{\mathcal{V}}.$$

The result now follows, since by Lemma 2.4, $\lambda_2 > 0$ for a connected graph.

(d) If $u_0 = v_0$, then $u(t) = v(t)$, for all $t > 0$, by the uniqueness theorem for ordinary
differential equations. In this case there is nothing more to prove. Moreover, by repeating the
argument on each connected component we may assume without loss of generality that the
entire graph is connected.

Let $u_0, v_0$ be such that, for all $j \in V$, we have $(u_0)_j \leq (v_0)_j$, and for some $j_0 \in V$,
$(u_0)_{j_0} < (v_0)_{j_0}$. We will show that in this case $u_j(t) < v_j(t)$, for every $j \in V$ and all $t > 0$,
which proves the strong comparison principle and in particular the comparison principle.

Arguing by contradiction, suppose that $v_j(t) \leq u_j(t)$ for some $t$ and some $j$. Let $t_0$ be the
last time $v(t)$ lies everywhere above $u(t)$, that is

$$t_0 := \sup\{ t \geq 0 : \forall s \in (0, t) \text{ we have } \forall j\ u_j(s) < v_j(s) \}.$$

By our assumption we have that $0 \leq t_0 < \infty$. Then, from the definition of $t_0$ there is some
$k \in V$ such that $u_k(t_0) = v_k(t_0)$ and $\dot u_k(t_0) \geq \dot v_k(t_0)$. Moreover, again due to the definition of
$t_0$, $u_j(t_0) \leq v_j(t_0)$, for all $j \in V$. This shows that if $u_i(t_0) < v_i(t_0)$ for some neighbor $i$ of $k$,
then $(-\Delta u(t_0))_k < (-\Delta v(t_0))_k$ and

$$0 = \dot u_k(t_0) + (\Delta u(t_0))_k > \dot v_k(t_0) + (\Delta v(t_0))_k = 0,$$

which is a contradiction. We conclude that $u(t_0) = v(t_0)$ at all neighbors of $k$, and by iterating
the above argument we get in fact that $u(t_0) = v(t_0)$ at all nodes of $V$, since we are dealing with
the case where the graph is connected. By the uniqueness theorem for ordinary differential equations
we conclude that $u_0 = v_0$ at all nodes, which gives a contradiction, and the strong comparison
principle is proved. □
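
Mass conservation and the comparison principle can be observed in a simple simulation. The following sketch approximates $e^{-t\Delta}$ by forward Euler steps, which is an assumption made for brevity (the lemma concerns the exact diffusion operator); the example graph, initial data, and function names are likewise hypothetical. With a step size satisfying $dt \cdot d_i^{1-r} \leq 1$ the discrete update is itself order preserving.

```python
def diffuse(w, u0, t, r, dt=1e-3):
    """Forward-Euler approximation of exp(-t * Delta) u0."""
    n = len(w)
    deg = [sum(row) for row in w]
    u = list(u0)
    for _ in range(int(round(t / dt))):
        u = [u[i] - dt * deg[i] ** (-r)
             * sum(w[i][j] * (u[i] - u[j]) for j in range(n))
             for i in range(n)]
    return u

def mass(w, u, r):
    """M(u) = sum_i u_i d_i^r, as in (10)."""
    deg = [sum(row) for row in w]
    return sum(u[i] * deg[i] ** r for i in range(len(u)))

# Connected weighted graph on 4 nodes, random walk Laplacian (r = 1).
w = [[0, 2, 1, 0], [2, 0, 3, 0], [1, 3, 0, 4], [0, 0, 4, 0]]
r = 1.0
u0 = [0.0, 1.0, 0.0, 0.2]
v0 = [0.5, 1.0, 0.0, 0.2]   # v0 >= u0 everywhere, strictly larger at node 0

u = diffuse(w, u0, 1.0, r)
v = diffuse(w, v0, 1.0, r)
mass_drift = abs(mass(w, u, r) - mass(w, u0, r))
ordered = all(a < b for a, b in zip(u, v))  # strict ordering at every node
```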

Remark 2.6. The dependence of the convergence of $u(t)$ to the steady state $(\mathrm{vol}\, V)^{-1} M \chi_V$
on the second Laplacian eigenvalue, $\lambda_2$, in Lemma 2.5(c), is related to the rate of convergence of
a Markov process on a graph to the uniform distribution [SBXD04]. Due to this property, $\lambda_2^{-1}$ is
sometimes referred to as the mixing-time for a graph. The second eigenvalue $\lambda_2$ is also referred
to as the algebraic connectivity or Fiedler value for a graph [Fie73], and plays an important role
in many applications. The robustness of a network to node/edge failures is highly dependent on
the algebraic connectivity of the graph. In the “chip-firing game” of Björner, Lovász, and Shor,
the algebraic connectivity dictates the length of a terminating game [BLS91]. The algebraic
connectivity is also related to the informativeness of a least-squares ranking on a graph [OBO13].
Consequently, algebraic connectivity is a measure of performance for the convergence rate in
sensor networks, data fusion, load balancing, and consensus problems [OSFM07].

2.2  Relation  between  graph  Laplacians  and  balanced  graph  cuts 

In spectral graph theory there are some well-known connections between the various graph Laplacians and different normalizations of the graph cut $\mathrm{TV}_a^1(\chi_S) = \sum_{i \in S,\, j \in S^c} \omega_{ij}$ from (2). For example, when $r = 0$ (and hence $\mathrm{vol}\,S = |S|$), the quantity

\[ \frac{\mathrm{TV}_a^1(\chi_S)}{\min(\mathrm{vol}\,S,\ \mathrm{vol}\,S^c)} \]

in Lemma 2.4(d) is the Cheeger cut and its minimum over $S \subset V$ is the Cheeger constant, e.g., [SM00, SB10]. Let $S_1, \ldots, S_k$ be a partition of all the nodes $V$, for a given integer $k$, and define the balanced graph cut

\[ C_r(S_1, \ldots, S_k) := \sum_{i=1}^{k} \frac{\mathrm{TV}_a^1(\chi_{S_i})}{\mathrm{vol}\,S_i}. \]

We use the subscript $r$ to remind us that $\mathrm{vol}\,S_i$ depends on $r$. For $r = 0$ this quantity is known in the literature as the ratio cut¹⁰, for $r = 1$ as the normalized cut [SM00, vL07]. These quantities are introduced because minimization of the graph cut, without a balancing term in the denominator, often leads to a partition with many singleton sets, which is typically unwanted in the application at hand. Minimization of this balanced cut over all partitions of $V$ is an NP-complete problem [WW93, SM00], but a relaxation of this problem can be defined using the graph Laplacian. For example, in [vL07, Section 5] it is shown that

\[ C_r(S_1, \ldots, S_k) = \mathrm{Tr}(H^T L H), \]

where $H$ is an $n$ by $k$ matrix with elements

\[ h_{ij} = \begin{cases} (\mathrm{vol}\,S_j)^{-1/2} & \text{if } i \in S_j,\\ 0 & \text{else.} \end{cases} \tag{12} \]

Note that $H$ is orthonormal in the $\mathcal{V}$ inner product, i.e., $H^T D^r H = I$, where $I$ is the $k$ by $k$ identity matrix. Hence the minimization of $C_r$ over all partitions is equivalent to minimizing $\mathrm{Tr}(H^T L H)$ over all $n$ by $k$ $\mathcal{V}$-orthonormal matrices of the form (12). The relaxation of this NP-complete minimization problem is now formulated by dropping the condition (12) and minimizing over all $n$ by $k$ $\mathcal{V}$-orthonormal matrices. The problem then becomes an eigenvalue problem and the first $k$ eigenvectors of $L$ are expected to be approximations to the indicator functions of the optimal partition $S_1, \ldots, S_k$. There is often no guarantee of the quality of this approximation though [ST96, GM98, KVV00, KVV04, ST07, vL07].
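The trace identity and its relaxation can be checked numerically. The following is a minimal sketch, assuming NumPy, the unnormalized Laplacian $L = D - W$, and a hypothetical six-node test graph; the helper names `degree_powers`, `balanced_cut` and `partition_matrix` are illustrative and do not come from the paper.

```python
import numpy as np

def degree_powers(W, r):
    """Node weights d_i^r, so that vol S is the sum of these weights over S."""
    return W.sum(axis=1) ** r

def balanced_cut(W, labels, r):
    """C_r(S_1, ..., S_k) = sum_j cut(S_j, S_j^c) / vol S_j."""
    vol = degree_powers(W, r)
    total = 0.0
    for j in np.unique(labels):
        in_S = labels == j
        total += W[np.ix_(in_S, ~in_S)].sum() / vol[in_S].sum()
    return total

def partition_matrix(W, labels, r):
    """The n-by-k matrix H of (12): h_ij = (vol S_j)^(-1/2) if i is in S_j, else 0."""
    vol = degree_powers(W, r)
    classes = np.unique(labels)
    H = np.zeros((len(labels), len(classes)))
    for col, j in enumerate(classes):
        in_S = labels == j
        H[in_S, col] = vol[in_S].sum() ** -0.5
    return H

# Hypothetical test graph: two triangles joined by one weak edge.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
                (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0), (2, 3, 0.1)]:
    W[i, j] = W[j, i] = w

r = 1.0
labels = np.array([0, 0, 0, 1, 1, 1])
D_r = np.diag(degree_powers(W, r))
L = np.diag(W.sum(axis=1)) - W           # unnormalized Laplacian D - W (assumed)

H = partition_matrix(W, labels, r)
cut_value = balanced_cut(W, labels, r)
trace_value = np.trace(H.T @ L @ H)

# Relaxation: drop (12) but keep H^T D^r H = I. The minimum is then the sum
# of the k smallest eigenvalues of D^{-r/2} L D^{-r/2}.
s = degree_powers(W, r) ** -0.5
relaxed_value = np.linalg.eigvalsh(s[:, None] * L * s[None, :])[:H.shape[1]].sum()
```

Here `cut_value` and `trace_value` agree, and `relaxed_value` is a lower bound for both, which illustrates why the relaxed eigenvalue problem carries no guarantee of recovering the optimal partition.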

In order to turn the approximations of the indicator functions into true indicator functions, a method like thresholding or $k$-means clustering is often used [NJW02]. For partitioning the nodes into two subsets ($k = 2$), the potential term in the Allen-Cahn equation (discussed in Section 5) can be interpreted as a nonlinear extension of the graph Laplacian eigenvalue problem which forces the approximate solutions to be close to indicator functions. In this light, it is interesting that the graph Ginzburg-Landau functional (with possibly a mass constraint), of which the Allen-Cahn equation is a gradient flow, $\Gamma$-converges to the graph cut functional [vGB12].


3  Curvature  and  mean  curvature  flow  on  graphs 

3.1  Graph  curvature 

In the continuum case, the mean curvature is given by (minus) the divergence of the normal vector field on the boundary of the set (see Appendix A.2 and (46)). This normal vector field achieves the supremum in the definition of total variation of a characteristic function (under sufficient smoothness conditions). Hence, we define the normal of a vertex set by using $\varphi^a$ from (1), which achieves the supremum of the anisotropic total variation.

¹⁰ Confusingly, the ratio cut is sometimes also called average cut, and the Cheeger cut is sometimes called ratio cut.


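Definition 3.1. The normal of the vertex set $S \subset V$ is

\[ \nu^S_{ij} := \mathrm{sgn}\big((\nabla \chi_S)_{ij}\big) = \begin{cases} 1 & \text{if } \omega_{ij} > 0,\ i \in S^c, \text{ and } j \in S,\\ -1 & \text{if } \omega_{ij} > 0,\ j \in S^c, \text{ and } i \in S,\\ 0 & \text{else.} \end{cases} \tag{13} \]

As in the continuum case, we define the curvature of a set as the divergence of the normal.

Definition 3.2. The curvature of the vertex set $S \subset V$ at node $i \in V$ is

\[ (\kappa_S^{q,r})_i := (\mathrm{div}\, \nu^S)_i = d_i^{-r} \begin{cases} \sum_{j \in S^c} \omega_{ij}^q & \text{if } i \in S,\\ -\sum_{j \in S} \omega_{ij}^q & \text{if } i \in S^c. \end{cases} \tag{14} \]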


Recall from (1) that $\varphi^a$ is not uniquely determined on $\{(i,j) \in E : (\nabla \chi_S)_{ij} = 0\}$ (i.e., away from the boundary $\partial S \cup \partial(S^c)$, in the sense of Definition 2.3) and hence the value 0 in (13) is a choice corresponding to the extension of the normal field away from the boundary. This ambiguity is irrelevant when $\mathrm{div}\,\nu^S$ is coupled to the characteristic function $\chi_S$ via the $\mathcal{V}$-inner product, as in

\[ \mathrm{TV}_a^q(\chi_S) = \langle \kappa_S^{q,r}, \chi_S \rangle_{\mathcal V}, \tag{15} \]

but  care  should  be  taken  when  trying  to  interpret  the  normal  or  the  curvature  outside  this 
setting. 
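The pairing (15) is easy to confirm numerically. Below is a minimal sketch, assuming NumPy, a hypothetical five-node weighted graph, and the inner product $\langle u, v\rangle_{\mathcal V} = \sum_i u_i v_i d_i^r$ from earlier in the paper; the function names are illustrative only.

```python
import numpy as np

def weights_q(W, q):
    """omega_ij^q on edges and 0 on non-edges."""
    Wq = np.zeros_like(W, dtype=float)
    Wq[W > 0] = W[W > 0] ** q
    return Wq

def curvature(W, in_S, q=1.0, r=0.0):
    """(kappa_S^{q,r})_i from (14); in_S is a boolean mask for the node set S."""
    Wq = weights_q(W, q)
    d = W.sum(axis=1)
    to_complement = Wq[:, ~in_S].sum(axis=1)
    to_set = Wq[:, in_S].sum(axis=1)
    return np.where(in_S, to_complement, -to_set) / d ** r

def total_variation(W, u, q=1.0):
    """TV_a^q(u) = (1/2) sum_ij omega_ij^q |u_i - u_j|."""
    return 0.5 * (weights_q(W, q) * np.abs(u[:, None] - u[None, :])).sum()

def inner_V(u, v, W, r=0.0):
    """<u, v>_V = sum_i u_i v_i d_i^r."""
    return (u * v * W.sum(axis=1) ** r).sum()

# Hypothetical weighted graph on five nodes.
W = np.zeros((5, 5))
for i, j, w in [(0, 1, 2.0), (1, 2, 1.0), (2, 3, 0.5),
                (3, 4, 1.5), (0, 4, 1.0), (1, 3, 0.25)]:
    W[i, j] = W[j, i] = w

q, r = 0.5, 1.0
in_S = np.array([True, True, False, False, False])
chi_S = in_S.astype(float)

kappa = curvature(W, in_S, q, r)
tv = total_variation(W, chi_S, q)
pairing_with_S = inner_V(kappa, chi_S, W, r)            # equals TV_a^q(chi_S)
pairing_with_V = inner_V(kappa, np.ones(5), W, r)       # equals 0
pairing_with_Sc = inner_V(kappa, 1.0 - chi_S, W, r)     # equals -TV_a^q(chi_S)
```

The factor $d_i^{-r}$ in the curvature cancels against the weight $d_i^r$ in the inner product, which is why the pairing does not depend on $r$.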

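Note that for $q = 1$ and $S \subset V$, $|(\kappa_S^{1,r})_i| \le d_i^{1-r}$ for all $i \in V$. Also, $\langle \kappa_S^{q,r}, \chi_V \rangle_{\mathcal V} = \langle \nu^S, \nabla \chi_V \rangle_{\mathcal E} = 0$. The curvature $\kappa_S^{q,r}$ has the property that it vanishes away from the boundary $\partial S \cup \partial(S^c)$. In particular, we see that the above mentioned ambiguity is also irrelevant for pairings with $\chi_{S^c}$, since

\[ \langle \kappa_S^{q,r}, \chi_{S^c} \rangle_{\mathcal V} = \langle \kappa_S^{q,r}, \chi_V \rangle_{\mathcal V} - \langle \kappa_S^{q,r}, \chi_S \rangle_{\mathcal V} = -\mathrm{TV}_a^q(\chi_S). \]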

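Let $S, \hat S \subset V$ be two node sets, then (15) implies

\[ \begin{aligned} \mathrm{TV}_a^q(\chi_{\hat S}) - \mathrm{TV}_a^q(\chi_S) &= \langle \kappa_{\hat S}^{q,r}, \chi_{\hat S} \rangle_{\mathcal V} - \langle \kappa_S^{q,r}, \chi_S \rangle_{\mathcal V} \\ &= \langle \kappa_{\hat S}^{q,r} + \kappa_S^{q,r}, \chi_{\hat S} - \chi_S \rangle_{\mathcal V} + \langle \kappa_{\hat S}^{q,r}, \chi_S \rangle_{\mathcal V} - \langle \kappa_S^{q,r}, \chi_{\hat S} \rangle_{\mathcal V}. \end{aligned} \tag{16} \]

For the last two terms, we compute

\[ \begin{aligned} \langle \kappa_{\hat S}^{q,r}, \chi_S \rangle_{\mathcal V} - \langle \kappa_S^{q,r}, \chi_{\hat S} \rangle_{\mathcal V} &= \Bigg( \sum_{i \in S \cap \hat S} \sum_{j \in \hat S^c} - \sum_{i \in S \setminus \hat S} \sum_{j \in \hat S} - \sum_{i \in S \cap \hat S} \sum_{j \in S^c} + \sum_{i \in \hat S \setminus S} \sum_{j \in S} \Bigg) \omega_{ij}^q \\ &= \Bigg( \sum_{i \in S \cap \hat S} \sum_{j \in S \setminus \hat S} - \sum_{i \in S \cap \hat S} \sum_{j \in \hat S \setminus S} + \sum_{i \in \hat S \setminus S} \sum_{j \in S \cap \hat S} - \sum_{i \in S \setminus \hat S} \sum_{j \in S \cap \hat S} \Bigg) \omega_{ij}^q \\ &= 0, \end{aligned} \]

where the second line follows by splitting $\hat S^c$, $S^c$, $S$ and $\hat S$ into their disjoint pieces and cancelling common terms, and the last equality uses the symmetry $\omega_{ij} = \omega_{ji}$. Thus

\[ \mathrm{TV}_a^q(\chi_{\hat S}) - \mathrm{TV}_a^q(\chi_S) = \langle \kappa_{\hat S}^{q,r} + \kappa_S^{q,r}, \chi_{\hat S} - \chi_S \rangle_{\mathcal V}. \tag{17} \]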
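In particular, if $\hat S = S \setminus \{n\}$ for a node $n \in S$, then

\[ \begin{aligned} \mathrm{TV}_a^q(\chi_{S \setminus \{n\}}) - \mathrm{TV}_a^q(\chi_S) &= -\langle \kappa_{S \setminus \{n\}}^{q,r} + \kappa_S^{q,r}, \chi_{\{n\}} \rangle_{\mathcal V} = \sum_{j \in S} \omega_{nj}^q - \omega_{nn}^q - \sum_{j \in S^c} \omega_{nj}^q \\ &= \Bigg( \sum_{j \in S} - \sum_{j \in S^c} \Bigg) \omega_{nj}^q. \end{aligned} \]

Because we assume there are no self-loops, in the final equality $\omega_{nn} = 0$. A similar computation shows for $n \in S^c$

\[ \mathrm{TV}_a^q(\chi_{S \cup \{n\}}) - \mathrm{TV}_a^q(\chi_S) = \Bigg( \sum_{j \in S^c} - \sum_{j \in S} \Bigg) \omega_{nj}^q - \omega_{nn}^q = \Bigg( \sum_{j \in S^c} - \sum_{j \in S} \Bigg) \omega_{nj}^q. \]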

The preceding discussion implies the following: if $\Omega \subset V$ is such that $S$ minimizes $\mathrm{TV}_a^q(\chi_S)$ among all sets $S' \subset V$ such that $S \Delta S' \subset \Omega$, then we have that

\[ \begin{cases} \Big( \sum_{j \in S} - \sum_{j \in S^c} \Big) \omega_{nj}^q \ge 0 & \text{if } n \in S \cap \Omega,\\[2pt] \Big( \sum_{j \in S} - \sum_{j \in S^c} \Big) \omega_{nj}^q \le 0 & \text{if } n \in S^c \cap \Omega \end{cases} \tag{18} \]

(compare with the nonlocal mean curvature, (23)). Here, $\Omega^c$ is a set where $S$ and $S'$ are forced to agree (similar to enforcing a boundary condition in the continuum case). The two inequality conditions in (18) are opposite for the two sides of the interface between $S$ and $S^c$. This strengthens the heuristic idea that the ‘real’ interface, where there would be an equality condition, is lost due to the lower bound on the accessible length scales on a graph.
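The single-node formulas and the optimality conditions (18) can be illustrated on a small example. This is a sketch under stated assumptions: NumPy, a hypothetical random symmetric weight matrix without self-loops, and a brute-force search over all sets that agree with $S$ outside a hypothetical set `omega_nodes`.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, q = 7, 0.5

# Hypothetical random symmetric weights with zero diagonal (no self-loops).
upper = np.triu(rng.uniform(0.1, 2.0, size=(n, n)) * (rng.random((n, n)) < 0.6), k=1)
W = upper + upper.T
Wq = np.zeros_like(W)
Wq[W > 0] = W[W > 0] ** q

def cut(in_S):
    """TV_a^q(chi_S) = sum over i in S, j in S^c of omega_ij^q."""
    return Wq[np.ix_(in_S, ~in_S)].sum()

def neighbor_balance(in_S, node):
    """(sum_{j in S} - sum_{j in S^c}) omega_nj^q, the quantity appearing in (18)."""
    return Wq[node, in_S].sum() - Wq[node, ~in_S].sum()

in_S = np.array([True, True, True, False, False, False, False])

# Removing n from S changes TV by +balance; adding n to S changes it by -balance.
max_error = 0.0
for node in range(n):
    flipped = in_S.copy()
    flipped[node] = not in_S[node]
    change = cut(flipped) - cut(in_S)
    balance = neighbor_balance(in_S, node)
    predicted = balance if in_S[node] else -balance
    max_error = max(max_error, abs(change - predicted))

# Minimize TV over all sets that differ from S only inside omega_nodes.
omega_nodes = [2, 3, 4]
candidates = []
for bits in itertools.product([False, True], repeat=len(omega_nodes)):
    cand = in_S.copy()
    cand[omega_nodes] = bits
    candidates.append(cand)
minimizer = min(candidates, key=cut)

conditions_hold = all(
    neighbor_balance(minimizer, node) >= -1e-12 if minimizer[node]
    else neighbor_balance(minimizer, node) <= 1e-12
    for node in omega_nodes
)
```

For the minimizer, every node of `omega_nodes` inside the set has at least as much weight towards the set as away from it, and every node outside has at most as much, which is exactly the pair of opposite inequalities in (18).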

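Remark 3.3. It is interesting to make some connections between the graph total variation $\mathrm{TV}_a^q$ and graph curvature $\kappa_S^{q,r}$ on the one hand, and the local clustering coefficient [WS98] on the other. Assume $G$ is unweighted (and undirected, as per usual in this paper), then the clustering coefficient $C_i$ of node $i$ is the number of triangles node $i$ is part of, divided by the number of possible triangles in the neighborhood of $i$, i.e., $C_i := |T_i| / \binom{d_i}{2}$, where

\[ T_i := \{ \{i,j,k\} : (i,j),\ (j,k),\ (k,i) \in E \}. \]

(A version for weighted graphs was introduced in [BBPSV04, Formula 5].) Using (2) and (16), we can rewrite, for $r = 1$,

\[ \begin{aligned} C_i &= \frac{1}{d_i(d_i - 1)} \sum_{j,h \in \mathcal{N}_i} \omega_{jh} = \frac{1}{d_i(d_i - 1)} \Bigg( \sum_{j \in V} \sum_{h \in \mathcal{N}_i} \omega_{jh} - \sum_{j \in \mathcal{N}_i^c} \sum_{h \in \mathcal{N}_i} \omega_{jh} \Bigg) \\ &= \frac{1}{d_i(d_i - 1)} \Bigg( \sum_{h \in \mathcal{N}_i} d_h - \mathrm{TV}_a^1(\chi_{\mathcal{N}_i}) \Bigg) = \frac{1}{d_i(d_i - 1)} \Big( \mathrm{vol}\,\mathcal{N}_i - \mathrm{TV}_a^1(\chi_{\mathcal{N}_i}) \Big) \\ &= \frac{1}{d_i(d_i - 1)} \Big( \langle \chi_{\mathcal{N}_i}, \chi_{\mathcal{N}_i} \rangle_{\mathcal V} + \langle \kappa_{\mathcal{N}_i}^{1,1}, \chi_{\mathcal{N}_i^c} \rangle_{\mathcal V} \Big). \end{aligned} \]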
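As a sanity check of this rewriting, the clustering coefficient can be computed both by counting triangles and from the volume and total variation of the neighborhood. The sketch below assumes NumPy and a hypothetical unweighted graph in which every node has degree at least 2, so that $d_i(d_i - 1) \neq 0$.

```python
from itertools import combinations
import numpy as np

# Hypothetical unweighted, undirected graph on five nodes.
A = np.zeros((5, 5))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (2, 4), (0, 3)]:
    A[i, j] = A[j, i] = 1.0
d = A.sum(axis=1)

def clustering_by_triangles(i):
    """Number of triangles through i divided by the number of neighbor pairs."""
    nbrs = np.flatnonzero(A[i])
    triangles = sum(A[j, h] for j, h in combinations(nbrs, 2))
    return triangles / (len(nbrs) * (len(nbrs) - 1) / 2)

def clustering_by_total_variation(i):
    """(vol N_i - TV_a^1(chi_{N_i})) / (d_i (d_i - 1)), with r = 1."""
    in_N = A[i] > 0
    vol_N = d[in_N].sum()
    tv_N = A[np.ix_(in_N, ~in_N)].sum()
    return (vol_N - tv_N) / (d[i] * (d[i] - 1))

by_triangles = np.array([clustering_by_triangles(i) for i in range(5)])
by_tv = np.array([clustering_by_total_variation(i) for i in range(5)])
```

For node 0, for instance, the neighbors are 1, 2, 3, two of the three neighbor pairs are connected, and both formulas give 2/3.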


As for the continuum case, we can arrive at the graph curvature, $\kappa_S^{q,r}$, in several ways. We discuss some of them below. However, the analogy with the continuum curvature becomes even more apparent if instead of the standard mean curvature, we consider the nonlocal mean curvature (see Section 3.2).

If we (formally) compute the first variation of the continuum $\mathrm{TV}(u)$ over all functions $u \in BV$ of bounded variation (and not restricting ourselves to characteristic functions only), we find $\mathrm{div}\big(\nabla u / |\nabla u|\big)$, the curvature of the level sets of $u$. In (28) and Appendix C we follow a similar procedure on the graph and again find $\mathrm{div}\,\varphi^a$, with $\varphi^a$ from (1).


Similarly, in the continuum case, an alternative definition of the mean curvature is $\mathrm{div}\big(\tfrac{\nabla \chi_S}{|\nabla \chi_S|}\big)\,|\nabla \chi_S|$, which is a Radon measure defined on the boundary. However, $|\nabla \chi_S| = 1$ on the boundary of $S$, so that the mean curvature is simply $\Delta \chi_S$. As long as $S$ is a rectifiable set, this computation can be made rigorous in the context of BV functions (see [EG92, Chapter 5]). Computing the analogous quantity on a graph, we find

\[ (\Delta \chi_S)_i = d_i^{-r} \begin{cases} \sum_{j \in S^c} \omega_{ij} & \text{if } i \in S,\\ -\sum_{j \in S} \omega_{ij} & \text{if } i \in S^c. \end{cases} \tag{19} \]

This is equal to $\kappa_S^{1,r}$ (compare with (3)). The choice $q = 1$ is a natural one, because it corresponds to the $\Gamma$-limit of the graph Ginzburg-Landau functional (GL$_\varepsilon$) [vGB12], whose definition is given in Section 5.
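The identity $\Delta \chi_S = \kappa_S^{1,r}$ can be verified directly. This is a short sketch, assuming NumPy, a hypothetical weighted graph, and the graph Laplacian $(\Delta u)_i = d_i^{-r} \sum_j \omega_{ij}(u_i - u_j)$, which is how we read the definition referenced as (3).

```python
import numpy as np

# Hypothetical weighted graph on five nodes.
W = np.zeros((5, 5))
for i, j, w in [(0, 1, 2.0), (1, 2, 1.0), (2, 3, 0.5), (3, 4, 1.5), (0, 4, 1.0)]:
    W[i, j] = W[j, i] = w

r = 0.5
d = W.sum(axis=1)

# Assumed form of the graph Laplacian: Delta = D^{-r} (D - W).
laplacian = (np.diag(d) - W) / d[:, None] ** r

in_S = np.array([True, True, False, False, True])
chi_S = in_S.astype(float)

# Curvature (14) with q = 1.
kappa = np.where(in_S, W[:, ~in_S].sum(axis=1), -W[:, in_S].sum(axis=1)) / d ** r

laplacian_of_indicator = laplacian @ chi_S
```

Both vectors are positive on boundary nodes inside $S$, negative on boundary nodes outside $S$, and zero on nodes with no neighbor across the interface.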

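In the continuum case the mean curvature $\kappa(x)$ at the point $x \in \partial S$ in the boundary of a set $S \subset \mathbb{R}^n$ satisfies the property that, if $S$ is smooth enough, then for any given ball $B_\delta(x)$ of radius $\delta$ and center $x \in \partial S$,

\[ |B_\delta(x) \cap S| - \tfrac12 |B_\delta(x)| = \kappa\, \delta^2 |B_\delta(x)| + o\big(\delta^2 |B_\delta(x)|\big). \]

Note that if $S$ were a half space, then the expression on the right is zero for all $\delta$, since $\partial S$ would separate each $B_\delta(x)$ in sets of equal volume. Thus $\kappa$ measures how much $\partial S$ deviates from cutting $B_\delta(x)$ in sets of equal volume. The analogous computation on a graph, replacing the ball by the set $\mathcal{N}_i' := \mathcal{N}_i \cup \{i\}$ of neighbors of node $i$ together with node $i$, gives, for $S \subset V$,

\[ \mathrm{vol}(\mathcal{N}_i' \cap S) - \tfrac12 \mathrm{vol}\,\mathcal{N}_i' = \sum_{j \in \mathcal{N}_i' \cap S} d_j^r - \tfrac12 \sum_{j \in \mathcal{N}_i'} d_j^r = \tfrac12 \Bigg( \sum_{j \in \mathcal{N}_i' \cap S} - \sum_{j \in \mathcal{N}_i' \cap S^c} \Bigg) d_j^r. \]

Note that if $r = 0$ and $G$ is an unweighted graph, such that $\omega_{ij} \in \{0, 1\}$, then

\[ \mathrm{vol}(\mathcal{N}_i' \cap S) - \tfrac12 \mathrm{vol}\,\mathcal{N}_i' = \tfrac12 \Bigg( \sum_{j \in \mathcal{N}_i' \cap S} - \sum_{j \in \mathcal{N}_i' \cap S^c} \Bigg) \omega_{ij}^q = \tfrac12 \Bigg( \sum_{j \in S} - \sum_{j \in S^c} \Bigg) \omega_{ij}^q, \]

in whose right hand side we recognize (18). In the last equality we used that $\omega_{ij} = 0$ if $j \notin \mathcal{N}_i$.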

The equality in (19) has an interesting consequence on the level of $\Gamma$-convergence of functionals. For more information about the theory of $\Gamma$-convergence, we refer to [DM93, Bra02]. The first result of $\Gamma$-convergence for the Ginzburg-Landau functional goes back to work of Modica and Mortola [MM77, Mod87].

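Theorem 3.4. Let $g \in C(\mathbb{R}^n)$, $\varepsilon > 0$, $W \in C^2(\mathbb{R})$ a nonnegative double well potential with wells at 0 and 1, and consider the functionals $f_\varepsilon, f_0 : \mathcal{V} \to \overline{\mathbb{R}}$, defined by

\[ f_\varepsilon(u) := \sum_{i \in V} g\big((\Delta u)_i\big) + \frac{1}{\varepsilon} \sum_{i \in V} W(u_i), \qquad f_0(u) := \begin{cases} \sum_{i \in V} g\big((\kappa_S^{1,r})_i\big) & \text{if } u = \chi_S \text{ for some } S \subset V,\\ +\infty & \text{else.} \end{cases} \]

Then $f_\varepsilon \xrightarrow{\Gamma} f_0$ as $\varepsilon \to 0$ (using any of the equivalent metrics on $\mathbb{R}^n$).

Furthermore, if the double well potential $W$ satisfies a coercivity condition, i.e., there exists a $c > 0$ such that for large $|u|$, $W(u) \ge c(u^2 - 1)$, then compactness holds, in the following sense: Let $\{\varepsilon_n\}_{n=1}^\infty \subset \mathbb{R}_+$ be such that $\varepsilon_n \to 0$ as $n \to \infty$, and let $\{u_n\}_{n=1}^\infty$ be a sequence such that there exists a $C > 0$ such that for all $n \in \mathbb{N}$, $f_{\varepsilon_n}(u_n) < C$. Then there exists a subsequence $\{u_{n'}\}_{n'=1}^\infty \subset \{u_n\}_{n=1}^\infty$ and a $u_\infty$ of the form $u_\infty = \chi_S$, for some $S \subset V$, such that $u_{n'} \to u_\infty$ as $n' \to \infty$.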


Proof. The key point in the proof of the $\Gamma$-convergence is to note that $f_\varepsilon$ is a continuous perturbation of the functional $w_\varepsilon : \mathcal{V} \to \mathbb{R}$,

\[ w_\varepsilon(u) := \frac{1}{\varepsilon} \sum_{i \in V} W(u_i). \]

By [vGB12, Lemma 3.3]¹¹, $w_\varepsilon \xrightarrow{\Gamma} w_0$ as $\varepsilon \to 0$, where

\[ w_0(u) := \begin{cases} 0 & \text{if } u = \chi_S \text{ for some } S \subset V,\\ +\infty & \text{else.} \end{cases} \]

By a well known property of $\Gamma$-convergence [DM93, Proposition 6.21], the $\Gamma$-limit is preserved under continuous perturbations. Then using the fact, shown above in (19), that $\Delta u = \kappa_S^{1,r}$ if $u = \chi_S$, completes the proof of $\Gamma$-convergence.

The compactness result is a direct adaptation of the proof of [vGB12, Theorem 3.2] to the current functionals $f_\varepsilon$. □


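Remark 3.5. Note that in Theorem 3.4 above, we can also use the double well potential $\tilde W$ with wells at $\pm 1$, instead of $W$. In that case, the limit functional $w_0$ in the proof takes finite values only for functions of the form $u = \chi_S - \chi_{S^c}$, for some $S \subset V$. Because $\Delta(\chi_S + \chi_{S^c}) = \Delta \chi_V = 0$, we have $\Delta(\chi_S - \chi_{S^c}) = 2\kappa_S^{1,r}$ and hence the limit functional $f_0$ takes the form

\[ f_0(u) = \begin{cases} \sum_{i \in V} g\big(2(\kappa_S^{1,r})_i\big) & \text{if } u = \chi_S - \chi_{S^c} \text{ for some } S \subset V,\\ +\infty & \text{else.} \end{cases} \]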

We end this subsection with another similarity between the graph based objects we introduced and their continuum counterparts. The gradient of the graph distance $d^{\partial S}$, from Definition 2.2, agrees with the normal $\nu$, from (13), on the boundary of $S$ induced by the graph distance, in the sense of the following lemma. This again corresponds to what we expect based on the continuum case. We define the signed distance to $\partial S$ as

\[ sd^{\partial S} := (\chi_{S^c} - \chi_S)\, d^{\partial S}. \tag{20} \]

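Lemma 3.6. Let $S \subset V$. Define the exterior boundary of $S$ induced by the graph distance $d^{\partial S}$ as

\[ \partial_{ext} S := \{ i \in S^c : \exists j \in \partial S \text{ such that } d_i^{\partial S} = \omega_{ij}^{q-1} \}. \]

Let $i \in \partial_{ext} S$, then there is a $j \in \partial S$ such that $(\nabla sd^{\partial S})_{ij} = -\nu_{ij}$.

Similarly, let the interior boundary of $S$ induced by the graph distance be

\[ \partial_{int} S := \{ i \in S : \exists j \in \partial S \text{ such that } d_i^{\partial S} = \omega_{ij}^{q-1} \}. \]

If $i \in \partial_{int} S$, then there is a $j \in \partial S$ such that $(\nabla sd^{\partial S})_{ij} = \nu_{ij}$.

¹¹ Note that in the statement and proof of Lemma 3.3 in [vGB12], it says $g_\varepsilon$ twice where $w_\varepsilon$ is meant.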
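Proof. First we note that, for $i \in \partial_{ext} S \subset S^c$, we have $sd_i^{\partial S} = d_i^{\partial S}$. Because $d^{\partial S}$ satisfies equation (13) (with $S$ replaced by $\partial S$), we have

\[ \min_{k \in \mathcal{N}_i} (\nabla d^{\partial S})_{ik} = \min_{k \in \mathcal{N}_i} \omega_{ik}^{1-q} \big( d_k^{\partial S} - d_i^{\partial S} \big) = -1. \]

Note that $\partial_{ext} S \subset \partial(S^c)$. Hence, $i \in \partial(S^c)$ and thus there is a $k \in \mathcal{N}_i$ such that $k \in \partial S$ and therefore $d_k^{\partial S} = 0$. Because $d^{\partial S}$ is nonnegative and $i \in \partial_{ext} S$, we deduce

\[ 1 = \max_{k \in \mathcal{N}_i} \omega_{ik}^{1-q}\, d_i^{\partial S} = \max_{k \in \mathcal{N}_i} \omega_{ik}^{1-q}\, \omega_{ij}^{q-1}. \]

Thus the maximum is achieved for $k = j$, and hence so is the minimum in (13), which shows that

\[ (\nabla sd^{\partial S})_{ij} = -1 = -\nu_{ij}. \]

The proof for $i \in \partial_{int} S$ follows from similar arguments, noting that there $sd^{\partial S} = -d^{\partial S}$. □

Note that $\partial_{ext} S \subset \partial(S^c)$, but equality does not necessarily hold. If a shortest path from $i \in \partial(S^c)$ to $\partial S$ does not consist of a single edge $(i,j)$ for some $j \in \partial S$, then $i \notin \partial_{ext} S$. This situation does not occur if the graph distances are consistent, in the following sense: for all $i, j, k \in V$, $\omega_{ij}^{q-1} \le \omega_{ik}^{q-1} + \omega_{kj}^{q-1}$. In that case, $\partial_{ext} S = \partial(S^c)$.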

3.2  Relation  with  the  continuum  nonlocal  mean  curvature 

There is a clear analogy between the expressions in (18) and the (continuum) nonlocal mean curvature [CS10, CRS10], as well as between $\mathrm{TV}_a^q$ and continuum nonlocal energy functionals. Consider an interaction kernel $K : \mathbb{R}^n \times \mathbb{R}^n \to [0, +\infty)$ with $K(x,y) = K(y,x)$ and

\[ \sup_{x \in \mathbb{R}^n} \int_{\mathbb{R}^n} \min\{1, |x - y|^2\}\, K(x,y)\, dy < +\infty. \]

This kernel $K$ can be thought of as the energy given by a long-range interaction between a particle placed at $x$ and a particle at $y$. It defines a functional on subsets $S \subset \mathbb{R}^n$, sometimes called “nonlocal perimeter” or “nonlocal energy”, and it is given by

\[ J_K(S) = L_K(S, S^c), \]

where for any pair $A, B \subset \mathbb{R}^n$ we write

\[ L_K(A, B) = \int_A \int_B K(x,y)\, dx\, dy. \tag{21} \]

Compare this with the graph case, if, for $A, B \subset V$, we write

\[ L_G(A, B) = \sum_{i \in A} \sum_{j \in B} \omega_{ij}^q. \tag{22} \]

In terms of this bilinear functional, the anisotropic total variation, defined in (2), can be rewritten

\[ \mathrm{TV}_a^q(\chi_S) = L_G(S, S^c). \]

We see that (21) is nothing but the continuum version of (22), and one may rightfully interpret the weight matrix as an interaction kernel between pairs of nodes in $G$ and $L_G(S, S^c)$ as measuring the total “interaction energy” between $S$ and $S^c$.
Now suppose that $S \subset \mathbb{R}^n$ minimizes $J_K(S)$ in some domain $\Omega$, meaning that if $S'$ is such that $S \Delta S' \subset\subset \Omega$ then $J_K(S) \le J_K(S')$. In this case one can see that the following two conditions must hold:

\[ \begin{cases} L_K(A, S) - L_K(A, S^c \setminus A) \le 0, & \forall A \subset S^c \cap \Omega,\\ L_K(A, S \setminus A) - L_K(A, S^c) \ge 0, & \forall A \subset S \cap \Omega. \end{cases} \tag{23} \]

If, arguing heuristically, we let $A$ shrink down to any $x \in (\partial S) \cap \Omega$, we find that

\[ \int_{\mathbb{R}^n} \big( \chi_S(y) - \chi_{S^c}(y) \big) K(x,y)\, dy = 0. \]

The integral on the left, which is well defined in the principal value sense¹² when $x \in \partial S$ and $\partial S$ is smooth enough ($C^2$ suffices), is known as the nonlocal mean curvature of $S$ at $x$ with respect to $K$, or just nonlocal mean curvature of $S$ at $x$, when $K$ is clear from the context. As with $J_K$ and $\mathrm{TV}_a^q$, we see that

\[ \kappa_{\mathrm{nonlocal}}(x) := \int_{\mathbb{R}^n} \big( \chi_S(y) - \chi_{S^c}(y) \big) K(x,y)\, dy \]

is exactly a continuum analogue of the quantity $\big( \sum_{j \in S} - \sum_{j \in S^c} \big) \omega_{nj}^q$ in (18); moreover, the inequalities in (23) are a continuum analogue of those in (18). Note however, that $\big( \sum_{j \in S} - \sum_{j \in S^c} \big) \omega_{nj}^q$ is defined for all $n \in V$, whereas $\kappa_{\mathrm{nonlocal}}(x)$ above is only defined when $x \in \partial S$ and $\partial S$ is smooth enough.

Known regularity results deal mostly with the case $K_s(x,y) := c_{n,s} |x - y|^{-n-s}$ for $s \in (0,1)$ [CRS10, CG10]. It is worth noting that $J_{K_s}$ is a fractional Sobolev norm of the characteristic function of $S$,

\[ J_{K_s}(S) = \|\chi_S\|^2_{\dot H^{s/2}}, \]

where $\|\cdot\|_{\dot H^{s/2}}$ is defined in terms of the Fourier transform $\hat f$ of $f$ by

\[ \|f\|_{\dot H^{s/2}} = \big\| |\xi|^{s/2} \hat f(\xi) \big\|_{L^2(\mathbb{R}^n)}. \]

Moreover, as $s \to 1^-$ the quantity above gives the perimeter of $S$, and the corresponding nonlocal mean curvature converges pointwise to the standard mean curvature. In [CS10] it is shown that if we consider the MBO scheme, where instead of the heat equation we use the fractional heat equation, then in the limit we get a set $S_t$ evolving over time with a normal velocity at $x \in \partial S_t$ given by

\[ V(x) = c_{n,s} \int_{\mathbb{R}^n} \frac{\chi_{S_t}(y) - \chi_{S_t^c}(y)}{|x - y|^{n+s}}\, dy. \tag{24} \]


3.3  Mean  curvature  flow 

In this section, we define a mean curvature flow on graphs and connect it with the curvature $\kappa_S^{q,r}$ in (14). It is not clear what is the most natural notion for the evolution of a phase in a graph. Do we want to consider a sequence of subsets $\{S_n\}_{n \in \mathbb{N}}$, or a continuous family $\{S_t\}_{t \ge 0}$ which, although piecewise constant in $t$, may change in arbitrarily small time intervals? How we connect solutions of the graph mean curvature flow to solutions of the graph (MBO$_\tau$) scheme from Section 4, or to solutions of the graph Allen-Cahn equation (ACE$_\varepsilon$) from Section 5, will depend on the answer to this question. For now, we shall be content with considering a phase evolution comprised of a discrete sequence of sets $S_n = S(n\,\delta t)$, $n \in \mathbb{N}$, that correspond to the state of the system at discrete time steps.

¹² Note that $K(x,y)$ could have a very strong singularity at $x = y$, making the integral diverge if taken as a Lebesgue integral. The boundedness of the principal value of this singular integral tells us that $\partial S$ must have some smoothness near $x$.

Our construction follows the well-known variational formulations for classical mean curvature flow [ATW93, LS95]. Appendix A.2 has a brief overview of mean curvature and the associated flow in the continuum case. An obstacle one encounters when trying to emulate the continuum level set method to express mean curvature flow on graphs is that, due to the lack of a discrete chain rule, the resulting equation is not independent of the choice of level set function.

Recall  the  notions  of  graph  distance  and  boundary  of  a  node  set  from  Section  2. 

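Definition 3.7. The mean curvature flow, $S_n = S(n\,\delta t)$, with discrete time step $\delta t$ for an initial set $S_0 \subset V$, is defined by

\[ S_{n+1} \in \operatorname*{arg\,min}_{\hat S \subset V} \mathcal{F}(\hat S, S_n), \tag{MCF$_{\delta t}$} \]

where

\[ \mathcal{F}(\hat S, S_n) := \mathrm{TV}_a^q(\chi_{\hat S}) - \mathrm{TV}_a^q(\chi_{S_n}) + \frac{1}{\delta t} \big\langle \chi_{\hat S} - \chi_{S_n},\ (\chi_{\hat S} - \chi_{S_n})\, d^{\Sigma_n} \big\rangle_{\mathcal V} \tag{25} \]

and

\[ \Sigma_n := \partial S_n \cup \partial(S_n^c) = \{ i \in V : \exists (i,j) \in E \text{ such that } (i \in S_n \wedge j \in S_n^c) \vee (i \in S_n^c \wedge j \in S_n) \}. \]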

Note that, for a given graph $G$, minimizers of $\mathcal{F}$ may not be unique. In this case different mean curvature flows can be defined, depending on the choice of $S_{n+1}$. An example of this non-uniqueness on the 4-regular graph is given in Section 6.5.

We choose to use the distance to $\Sigma_n$, instead of the distance to either $\partial S_n$ or $\partial(S_n^c)$, so that the mean curvature flow is not a priori (independent of curvature) biased to either adding nodes to or removing nodes from $S$.

Since nodes in $\Sigma_n$ can be added to or removed from $S$ without increasing the last term of (25), every stationary state $\chi_S$ of (MCF$_{\delta t}$) is a minimal surface in the sense that $\mathrm{TV}_a^q(\chi_S) \le \mathrm{TV}_a^q(\chi_{S \cup \{n\}})$ for $n \in \partial(S^c)$ and $\mathrm{TV}_a^q(\chi_S) \le \mathrm{TV}_a^q(\chi_{S \setminus \{n\}})$ for $n \in \partial S$ (in the case where the minimizer of $\mathcal{F}$ is unique, the inequalities are strict). In particular, the sequence defined by (MCF$_{\delta t}$) is not “frozen” for $\delta t$ arbitrarily small.
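To make the definition concrete, one step of the flow can be computed by exhaustive search on a very small graph. The sketch below is not the method of the paper or of [CD09]; it assumes NumPy, a hypothetical 3-by-3 grid graph, and that the graph distance $d^{\Sigma_n}$ of Definition 2.2 is the shortest-path distance with edge lengths $\omega_{ij}^{q-1}$ on a connected graph. Ties between minimizers are broken by enumeration order.

```python
import itertools
import numpy as np

def weights_q(W, q):
    Wq = np.zeros_like(W, dtype=float)
    Wq[W > 0] = W[W > 0] ** q
    return Wq

def distance_to_set(W, targets, q):
    """Shortest-path distance to the mask `targets`; edge lengths omega_ij^(q-1) (assumed)."""
    n = len(W)
    length = np.full((n, n), np.inf)
    length[W > 0] = W[W > 0] ** (q - 1.0)
    np.fill_diagonal(length, 0.0)
    for k in range(n):                       # Floyd-Warshall relaxation
        length = np.minimum(length, length[:, [k]] + length[[k], :])
    return length[:, targets].min(axis=1)

def interface(W, in_S):
    """Sigma_n: nodes with at least one neighbor on the other side."""
    touches_complement = W[:, ~in_S].sum(axis=1) > 0
    touches_set = W[:, in_S].sum(axis=1) > 0
    return (in_S & touches_complement) | (~in_S & touches_set)

def mcf_step(W, in_S, dt, q=1.0, r=0.0):
    """One brute-force step of (MCF_dt): returns (new set, value of F at the new set)."""
    Wq = weights_q(W, q)
    node_weight = W.sum(axis=1) ** r
    cut = lambda s: Wq[np.ix_(s, ~s)].sum()
    sigma = interface(W, in_S)
    if not sigma.any():                      # S is empty or all of V
        return in_S.copy(), 0.0
    dist = distance_to_set(W, sigma, q)
    best, best_value = in_S.copy(), 0.0      # F(S_n, S_n) = 0
    for bits in itertools.product([False, True], repeat=len(W)):
        cand = np.array(bits)
        penalty = (node_weight * dist)[cand != in_S].sum() / dt
        value = cut(cand) - cut(in_S) + penalty
        if value < best_value - 1e-12:
            best, best_value = cand, value
    return best, best_value

# Hypothetical example: 3-by-3 grid with unit weights, S_0 = one corner node.
side = 3
W = np.zeros((side * side, side * side))
for a in range(side):
    for b in range(side):
        i = a * side + b
        if b + 1 < side:
            W[i, i + 1] = W[i + 1, i] = 1.0
        if a + 1 < side:
            W[i, i + side] = W[i + side, i] = 1.0

S0 = np.zeros(side * side, dtype=bool)
S0[0] = True
S1, F_value = mcf_step(W, S0, dt=0.5)
```

In this example the isolated corner node lies in $\Sigma_0$, so removing it costs nothing in the distance term and lowers the cut from 2 to 0; the step therefore returns the empty set, the graph counterpart of a small island shrinking away.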

Remark  3.8.  In  the  last  term  of  J-(S,Sn),  we  use  a  symmetrized  distance  to  the  boundary, 
(Xs  ~  XS„ >  (,\'5  —  xsn)d^)v.  Other  choices  are  possible  here,  including  \\xscds  —  XsdSL  ||2.  For 
this  alternative  choice,  (MCFgt)  exhibits  “freezing”. 

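We can rewrite the last term in $\mathcal{F}(\hat S, S_n)$ in terms of the signed graph distance

\[ sd^{\Sigma_n} := (\chi_{S_n^c} - \chi_{S_n})\, d^{\Sigma_n} \]

(compare with (20)), which takes nonnegative values in $S_n^c$ and nonpositive values in $S_n$. We state the precise result in the following lemma.

Lemma 3.9. $\operatorname*{arg\,min}_{\hat S \subset V} \mathcal{F}(\hat S, S_n) = \operatorname*{arg\,min}_{\hat S \subset V} \mathcal{F}'(\hat S, S_n)$, where

\[ \begin{aligned} \mathcal{F}'(\hat S, S_n) &:= \mathrm{TV}_a^q(\chi_{\hat S}) - \mathrm{TV}_a^q(\chi_{S_n}) + \frac{1}{\delta t} \langle \chi_{\hat S}, sd^{\Sigma_n} \rangle_{\mathcal V} \\ &= \langle \kappa_{\hat S}^{q,r} + \kappa_{S_n}^{q,r},\ \chi_{\hat S} - \chi_{S_n} \rangle_{\mathcal V} + \frac{1}{\delta t} \langle \chi_{\hat S}, sd^{\Sigma_n} \rangle_{\mathcal V}. \end{aligned} \tag{26} \]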
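Proof. The rewriting of $\mathrm{TV}_a^q(\chi_{\hat S}) - \mathrm{TV}_a^q(\chi_{S_n})$ in terms of the curvatures follows directly from (17). For the distance term, we compute

\[ \begin{aligned} \langle (\chi_{\hat S} - \chi_{S_n})^2, d^{\Sigma_n} \rangle_{\mathcal V} &= \langle \chi_{\hat S}(1 - 2\chi_{S_n}) + \chi_{S_n},\ d^{\Sigma_n} \rangle_{\mathcal V} \\ &= \langle \chi_{\hat S},\ (\chi_{S_n^c} - \chi_{S_n})\, d^{\Sigma_n} \rangle_{\mathcal V} + \langle \chi_{S_n}, d^{\Sigma_n} \rangle_{\mathcal V}, \end{aligned} \]

where in the last line we used that $1 - \chi_{S_n} = \chi_{S_n^c}$. The proof is completed by noting that the last term above does not depend on $\hat S$. □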

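Theorem 3.10. Let $u \in \mathcal{V}$ be a minimizer of the convex functional

\[ F(u) := \mathrm{TV}_a^q(u) + \frac{1}{\delta t} \langle u, sd^{\Sigma_n} \rangle_{\mathcal V}, \tag{27} \]

then, for a.e. $s \in \mathbb{R}$, the superlevel set

\[ E(s) := \{ i \in V : u_i > s \} \]

is a minimizer $\hat S$ of $\mathcal{F}(\cdot, S_n)$ from (25).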

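Proof. In Lemma B.1 in Appendix B we show that

\[ \mathrm{TV}_a^q(u) = \frac12 \sum_{i,j \in V} \omega_{ij}^q |u_i - u_j| = \frac12 \int_{\mathbb{R}} \sum_{i,j \in V} \omega_{ij}^q \big| (\chi_{E(s)})_i - (\chi_{E(s)})_j \big|\, ds = \int_{\mathbb{R}} \mathrm{TV}_a^q(\chi_{E(s)})\, ds. \]

Writing $u = \int_{\mathbb{R}} \chi_{E(s)}\, ds$, as in (48), we also see that

\[ \langle u, sd^{\Sigma_n} \rangle_{\mathcal V} = \int_{\mathbb{R}} \langle \chi_{E(s)}, sd^{\Sigma_n} \rangle_{\mathcal V}\, ds. \]

This gives

\[ F(u) = \int_{\mathbb{R}} \Big[ \mathrm{TV}_a^q(\chi_{E(s)}) + \frac{1}{\delta t} \langle \chi_{E(s)}, sd^{\Sigma_n} \rangle_{\mathcal V} \Big] ds = \int_{\mathbb{R}} \Big[ \mathcal{F}'(E(s), S_n) + \mathrm{TV}_a^q(\chi_{S_n}) \Big] ds, \]

where $\mathcal{F}'$ is as in (26). Hence, if $u$ minimizes $F$, then a.e. superlevel set $E(s)$ minimizes $\mathcal{F}'(\cdot, S_n)$. Lemma 3.9 now completes the proof. □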
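The layer-cake identity for the total variation used at the start of this proof can be checked exactly for a function on a finite graph, because the superlevel sets only change at the finitely many values of $u$. A minimal sketch, assuming NumPy and hypothetical random data; `Wq` stands for the matrix of weights $\omega_{ij}^q$.

```python
import numpy as np

def total_variation(Wq, u):
    """TV_a^q(u) = (1/2) sum_ij omega_ij^q |u_i - u_j|, with Wq holding omega_ij^q."""
    return 0.5 * (Wq * np.abs(u[:, None] - u[None, :])).sum()

rng = np.random.default_rng(1)
n = 6
tri = np.triu(rng.uniform(0.5, 1.5, size=(n, n)), k=1)
Wq = tri + tri.T                       # hypothetical symmetric weights, zero diagonal
u = rng.normal(size=n)                 # hypothetical node function

# E(s) = {u > s} is constant for s between consecutive values of u, and it is
# all of V (below min u) or empty (above max u) elsewhere, where TV vanishes.
levels = np.unique(u)                  # sorted distinct values
layered = 0.0
for lower, higher in zip(levels[:-1], levels[1:]):
    indicator = (u > lower).astype(float)
    layered += (higher - lower) * total_variation(Wq, indicator)

direct = total_variation(Wq, u)
```

The two numbers `layered` and `direct` coincide up to rounding, which is the discrete form of $\mathrm{TV}_a^q(u) = \int_{\mathbb{R}} \mathrm{TV}_a^q(\chi_{E(s)})\, ds$.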


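Remark 3.11. The function $\mathrm{TV}_a^q(u)$ is a convex function, thus, taking $\partial \mathrm{TV}_a^q(u)$ as the (possibly multivalued) subdifferential of $\mathrm{TV}_a^q(u)$ [ET76], any minimizer of

\[ u \mapsto \mathrm{TV}_a^q(u) + \langle u, g \rangle_{\mathcal V}, \]

for $g \in \mathcal{V}$, will solve the differential inclusion

\[ \partial \mathrm{TV}_a^q(u) \ni -g. \]

From the definition of $\mathrm{TV}_a^q$ we see that its subdifferential is only multivalued for those $u$ for which $\nabla u$ vanishes on some edge; at all other $u$, $\mathrm{TV}_a^q$ is pointwise differentiable. In particular, if $u, v \in \mathcal{V}$ and $\nabla u$ is never zero, we may differentiate¹³

\[ \begin{aligned} \frac{d}{dt}\Big|_{t=0} \mathrm{TV}_a^q(u + tv) &= \frac{d}{dt}\Big|_{t=0} \frac12 \sum_{i,j} \omega_{ij}^q \big| u_i - u_j + t(v_i - v_j) \big| \\ &= \frac12 \sum_{i,j} \omega_{ij}^q\, \mathrm{sgn}(u_i - u_j)(v_i - v_j) = \langle \mathrm{sgn}(\nabla u), \nabla v \rangle_{\mathcal E}. \end{aligned} \tag{28} \]

¹³ For further discussion and a generalization of this computation, see Appendix C.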
Since div is the adjoint of $\nabla$, it follows that, for all $v \in \mathcal{V}$,

\[ \frac{d}{dt}\Big|_{t=0} \Big( \mathrm{TV}_a^q(u + tv) + \langle u + tv, g \rangle_{\mathcal V} \Big) = \langle \mathrm{div}(\mathrm{sgn}(\nabla u)) + g,\ v \rangle_{\mathcal V}. \]

Therefore, the Euler-Lagrange equation for minimizers of $F$ from (27) is

\[ \mathrm{div}\, \mathrm{sgn}(\nabla u) + \frac{1}{\delta t}\, sd^{\Sigma_n} = 0, \]

provided $\nabla u$ is never zero. Whenever $(\nabla u)_{ij} = 0$ for some $i, j$, the above equation is replaced by a differential inclusion in terms of the subdifferential of the absolute value function. Concretely, since the subdifferential of the absolute value function at 0 is the interval $[-1, 1]$, there exists $\varphi \in \mathcal{E}$ such that $|\varphi_{ij}| \le 1$ for all nodes $i, j \in V$, and

\[ \mathrm{div}\, \varphi + \frac{1}{\delta t}\, sd^{\Sigma_n} = 0. \]
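The directional derivative (28) can be compared against a finite difference. The sketch below assumes NumPy and hypothetical random data; `Wq` again stands for the weights $\omega_{ij}^q$, and $u$ is given distinct values so that $\nabla u$ is nonzero on every edge.

```python
import numpy as np

def total_variation(Wq, u):
    return 0.5 * (Wq * np.abs(u[:, None] - u[None, :])).sum()

rng = np.random.default_rng(2)
n = 6
tri = np.triu(rng.uniform(0.5, 1.5, size=(n, n)), k=1)
Wq = tri + tri.T                              # hypothetical weights omega_ij^q

u = rng.permutation(n).astype(float)          # distinct values: grad u never vanishes
v = rng.normal(size=n)                        # direction of the variation

# Right-hand side of (28): (1/2) sum_ij omega_ij^q sgn(u_i - u_j) (v_i - v_j).
formula = 0.5 * (Wq * np.sign(u[:, None] - u[None, :]) * (v[:, None] - v[None, :])).sum()

# Central difference. TV is piecewise linear, so for a small step that does not
# change any sign of u_i - u_j the difference quotient matches up to rounding.
h = 1e-3
finite_difference = (total_variation(Wq, u + h * v) - total_variation(Wq, u - h * v)) / (2 * h)
```

If two entries of $u$ were equal, the central difference would no longer match the formula, which is the point at which the subdifferential becomes multivalued.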

Fast computational methods for the solution of (MCF$_{\delta t}$) based on max flow/min cut algorithms are developed in [CD09]. These methods exploit the homogeneity and submodularity of the total variation functional, $\mathrm{TV}_a^q$.

4  Threshold  dynamics  on  graphs 

In this section we study the threshold dynamics or Merriman-Bence-Osher algorithm on a graph $G$. For a short overview of the continuum case we refer to Section A.3 in Appendix A.

4.1  The  graph  MBO  algorithm 

The MBO scheme on a graph, describing the evolution of a node subset $S \subset V$, is given as follows.

Algorithm (MBO$_\tau$): The Merriman-Bence-Osher algorithm on a graph.

Data: An initial node subset $S_0 \subset V$, a time step $\tau > 0$, and the number of time steps $N > 0$.

Output: A sequence of node sets $\{S_k\}_{k=1}^N$, which is the (MBO$_\tau$) evolution of $S_0$.

for $k = 1$ to $N$, do

    Diffusion step. Let $v = e^{-\tau\Delta} \chi_{S_{k-1}}$ denote the solution at time $\tau$ of the initial value problem

    \[ \dot v = -\Delta v, \qquad v(0) = \chi_{S_{k-1}}. \tag{29} \]

    Here $\chi_S$ denotes the characteristic function of the set $S$.

    Threshold step. Define the set $S_k \subset V$ to be

    \[ S_k = \{ i \in V : v_i \ge \tfrac12 \}. \]

By the comparison principle, Lemma 2.5(d), we note that the solution to (29) satisfies $v(t) \in [0, 1]^n$ for all $t \in [0, \tau]$.
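The algorithm translates almost line by line into code. The following is a minimal sketch, assuming NumPy, the Laplacian $\Delta = D^{-r}(D - W)$, and a hypothetical path graph; the heat operator is formed through the symmetric matrix $D^{-r/2}(D - W)D^{-r/2}$, which is one convenient choice rather than the implementation used in the cited applications.

```python
import numpy as np

def heat_operator(W, tau, r=0.0):
    """e^{-tau * Delta} for Delta = D^{-r} (D - W), via a symmetric eigendecomposition."""
    d = W.sum(axis=1)
    s = d ** (-r / 2.0)
    L_sym = s[:, None] * (np.diag(d) - W) * s[None, :]      # similar to Delta
    lam, Q = np.linalg.eigh(L_sym)
    # Delta = D^{-r/2} L_sym D^{r/2}, hence e^{-tau Delta} = D^{-r/2} Q e^{-tau lam} Q^T D^{r/2}.
    return (s[:, None] * Q * np.exp(-tau * lam)[None, :]) @ (Q.T / s[None, :])

def mbo(W, in_S0, tau, n_steps, r=0.0):
    """Return [S_0, S_1, ..., S_N] as boolean masks."""
    E = heat_operator(W, tau, r)
    sets = [np.asarray(in_S0, dtype=bool)]
    for _ in range(n_steps):
        v = E @ sets[-1].astype(float)      # diffusion step
        sets.append(v >= 0.5)               # threshold step (level set 1/2 is included)
    return sets

# Hypothetical example: path graph on six nodes, S_0 = first four nodes.
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
S0 = np.array([True, True, True, True, False, False])

frozen = mbo(W, S0, tau=1e-3, n_steps=5)     # very small time step
trivial = mbo(W, S0, tau=1e3, n_steps=1)     # very large time step
diffused = heat_operator(W, 0.7, r=1.0) @ S0.astype(float)
```

With the tiny time step the set never changes, and with the huge time step the diffused function is essentially the constant $\mathrm{vol}\,S_0 / \mathrm{vol}\,V = 2/3$, so a single iteration returns all of $V$; both effects are analyzed in Section 4.2.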
Remark 4.1. In the thresholding step of the (MBO$_\tau$) scheme, we have arbitrarily chosen to include the level set $\{ i \in V : v_i = \tfrac12 \}$ in the new set $S_k$, i.e., the value at nodes $i$ for which $v_i = \tfrac12$ is set to 1.

An alternative description of the algorithm is as follows. Let $u_k$ be the indicator of the set $S_k$ as defined by the (MBO$_\tau$) algorithm. If we define the thresholding function $P : \mathbb{R} \to \{0, 1\}$, which acts by thresholding,

\[ P(x) = \begin{cases} 1 & \text{if } x \ge \tfrac12,\\ 0 & \text{if } x < \tfrac12, \end{cases} \]

then the iterates can be succinctly written $u_k = (P e^{-\tau\Delta})^k u_0$.

Several  papers  use  the  MBO  algorithm  on  a  graph  to  approximate  motion  by  mean  curvature. 
For  example,  in  [MKB12,  GCMB+13,  HLPB13],  the  MBO  algorithm  on  graphs  was  implemented 
and  used  to  study  data  clustering,  community  detection,  segmentation,  object  recognition, 
and  inpainting.  This  is  accomplished  by  simply  reinterpreting  the  Laplacian  in  (47),  or  in 
appropriate  extensions  of  (47),  as  the  graph  Laplacian. 


4.2 The “step-size” $\tau$ in the (MBO$_\tau$) algorithm

As discussed at the end of Section A.3, in a finite difference discretization of the continuum MBO algorithm, the time step $\tau$ must be chosen carefully to avoid trivial dynamics. For $\tau$ too small, there is not enough diffusion to change the value of $u$ at neighboring grid points beyond the threshold value. In this case, the solution is stationary under an (MBO$_\tau$) iteration and we say that the solution is frozen or pinned. For $\tau$ too large there is so much diffusion that a stationary state is reached after one iteration in the MBO scheme. It is not surprising that these finite difference effects also appear for the MBO algorithm on graphs. From the form of the heat solution operator, $e^{-\tau\Delta}$, we expect that $\tau$ should be roughly chosen in the interval $(\lambda_n^{-1}, \lambda_2^{-1})$. Theorems 4.2 and 4.3 strengthen this intuition.

The following theorem gives a lower bound on the choice of $\tau$ to avoid freezing in the (MBO$_\tau$) algorithm on general graphs.

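Theorem 4.2. Let $\rho$ be the spectral radius of the graph Laplacian, $\Delta$. Then the (MBO$_\tau$) iterations on the graph with initial set $S$ are stationary if either of the following two conditions is satisfied:

\[ \tau < \tau_\rho(S) := \rho^{-1} \log\Big( 1 + \tfrac12\, d_-^{r/2} (\mathrm{vol}\,S)^{-1/2} \Big) \tag{30} \]

or

\[ \tau < \tau_\kappa(S) := \frac{1}{2 \|\Delta \chi_S\|_{\mathcal V, \infty}}. \tag{31} \]

In particular, since $\mathrm{vol}\,S \ge d_-^r$, if $\tau < \log\tfrac32 \cdot \rho^{-1} \approx 0.4 \cdot \rho^{-1}$, then (30) implies the MBO iterates are pinned for any initial $S \subset V$.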


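Proof. To prove (30), let $\chi_S$ be the characteristic function of a set $S \subset V$. For a node to be added to or removed from $S$ by one iteration of (MBO$_\tau$), it is necessary that $\|e^{-\tau\Delta} \chi_S - \chi_S\|_{\mathcal V, \infty} \ge \tfrac12$. For the linear operator $A : \mathcal{V} \to \mathcal{V}$, let $\|A\|_{\mathcal V}$ be the operator norm induced by $\|\cdot\|_{\mathcal V}$, i.e.,

\[ \|A\|_{\mathcal V} = \max_{u \in \mathcal{V} \setminus \{0\}} \frac{\|Au\|_{\mathcal V}}{\|u\|_{\mathcal V}} \]

(see also Lemma 2.4(b)). Using Lemma 2.1, we compute

\[ \|e^{-\tau\Delta} \chi_S - \chi_S\|_{\mathcal V, \infty} \le d_-^{-r/2} \|e^{-\tau\Delta} \chi_S - \chi_S\|_{\mathcal V} \le d_-^{-r/2} \|e^{-\tau\Delta} - \mathrm{Id}\|_{\mathcal V} \sqrt{\mathrm{vol}\,S}. \]

Using the triangle inequality and the submultiplicative property of $\|\cdot\|_{\mathcal V}$ (see, e.g., [HJ90]), we compute

\[ \|e^{-\tau\Delta} - \mathrm{Id}\|_{\mathcal V} \le \sum_{k=1}^{\infty} \frac{1}{k!} \big( \tau \|\Delta\|_{\mathcal V} \big)^k = e^{\rho\tau} - 1. \]

Thus, if $\tau < \rho^{-1} \log\big( 1 + \tfrac12\, d_-^{r/2} (\mathrm{vol}\,S)^{-1/2} \big)$, all nodes are stationary under an (MBO$_\tau$) iteration.

To prove (31), we write the solution to the heat equation at time $\tau$,

\[ u(\tau) = e^{-\tau\Delta} \chi_S = \chi_S - \int_0^\tau \Delta u(t)\, dt. \]

This implies

\[ \|u(\tau) - \chi_S\|_{\mathcal V, \infty} = \Big\| \int_0^\tau \Delta u(t)\, dt \Big\|_{\mathcal V, \infty} \le \int_0^\tau \|e^{-t\Delta} \Delta \chi_S\|_{\mathcal V, \infty}\, dt \le \tau \|\Delta \chi_S\|_{\mathcal V, \infty}. \]

Here, we used the comparison principle, Lemma 2.5(d). Thus, if $\tau < \frac{1}{2 \|\Delta \chi_S\|_{\mathcal V, \infty}}$, then $\|u(\tau) - \chi_S\|_{\mathcal V, \infty} < \tfrac12$, implying the (MBO$_\tau$) solution is stationary. □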

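The following corollary of Lemma 2.5(c) shows that an upper bound on $\tau$ is necessary to avoid trivial dynamics.

Theorem 4.3. Let the graph be connected. Consider the (MBO$_\tau$) algorithm with initial condition $u_0 = \chi_S$, for a node set $S \subset V$. Assume $R_S := \frac{\mathrm{vol}\,S}{\mathrm{vol}\,V} \neq \tfrac12$. If

\[ \tau > \tau_t := \frac{1}{\lambda_2} \log\Bigg( \frac{(\mathrm{vol}\,S)^{1/2} (\mathrm{vol}\,S^c)^{1/2}}{(\mathrm{vol}\,V)^{1/2}\, |R_S - \tfrac12|\, d_-^{r/2}} \Bigg), \tag{32} \]

where the mass $M(u_0)$ is defined in (10), then

\[ P e^{-\tau\Delta} u_0 = \begin{cases} \chi_V & \text{if } R_S > \tfrac12,\\ 0 & \text{if } R_S < \tfrac12. \end{cases} \]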

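Proof. In Lemma 2.5(c), set $\epsilon = \left|(\mathrm{vol}\, V)^{-1} M(u_0) - \frac12\right| = \left|R_S - \frac12\right|$. This implies that $\|u(\tau) - R_S\|_{\mathcal{V},\infty} < \left|R_S - \frac12\right|$, as desired. With $\epsilon = \left|R_S - \frac12\right|$, the condition on $\tau$ in Lemma 2.5(c) is $\tau > \frac{1}{\lambda_2} \log\left( \left|R_S - \frac12\right|^{-1} d_-^{-r/2}\, \|u_0 - R_S\|_{\mathcal{V}} \right)$. For $u_0 = \chi_S$,

$$\|u_0 - R_S\|_{\mathcal{V}} = \sqrt{\frac{\mathrm{vol}\, S \ \mathrm{vol}\, S^c}{\mathrm{vol}\, V}}.$$

□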


This corollary shows that, if $\tau$ is chosen too large, one iteration of (MBO$_\tau$) leads to a trivial
state $u = \chi_V$ or $u = 0$, which is stationary under the algorithm (MBO$_\tau$).
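The bound (32) can be checked on a small example. This is a hedged sketch under assumptions that are not in the text: the unit-weight cycle $C_6$ with $r = 0$ (so $\mathrm{vol}$ counts nodes and $d_-^r = 1$), the known value $\lambda_2 = 2 - 2\cos(2\pi/n)$ for the cycle Laplacian, and a hypothetical helper `expm` that approximates the matrix exponential by a scaled Taylor series.

```python
import math

def mat_mul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def expm(A, terms=25, squarings=8):
    """Approximate exp(A): Taylor series of A / 2**squarings, then repeated squaring."""
    n = len(A)
    X = [[a / 2 ** squarings for a in row] for row in A]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[t / k for t in row] for row in mat_mul(term, X)]  # X**k / k!
        result = [[r + t for r, t in zip(rr, tt)] for rr, tt in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

n = 6
L = [[0.0] * n for _ in range(n)]            # unnormalized Laplacian of the cycle C_6
for i in range(n):
    L[i][i] = 2.0
    L[i][(i + 1) % n] -= 1.0
    L[i][(i - 1) % n] -= 1.0

chi_S = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0]       # S = {0, 1}, so R_S = 1/3 < 1/2
lambda_2 = 2 - 2 * math.cos(2 * math.pi / n)
vol_S, vol_V = sum(chi_S), float(n)
R_S = vol_S / vol_V
# Bound (32) with r = 0.
tau_t = math.log(math.sqrt(vol_S * (vol_V - vol_S) / vol_V) / abs(R_S - 0.5)) / lambda_2

heat = expm([[-1.01 * tau_t * a for a in row] for row in L])   # time step just above tau_t
u = [sum(h * c for h, c in zip(row, chi_S)) for row in heat]
after_one_step = [1.0 if v >= 0.5 else 0.0 for v in u]
```

With a time step slightly above $\tau_t$, the diffused function is uniformly closer to its mean $R_S$ than $\left|R_S - \frac12\right|$, so thresholding returns the zero function here.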

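The following theorem gives a condition for which there is a gap between the lower and upper bound for $\tau$.

Theorem 4.4. Consider the (MBO$_\tau$) iterations on a graph with $n \ge 2$. Let $\tau_\rho$ and $\tau_t$ be defined as in (30) and (32). If $\frac{\lambda_2}{\lambda_n} < \frac{\log\sqrt{2}}{\log\frac32} \approx 0.85$, then $\tau_\rho < \tau_t$.

Proof. Since $\left|R_S - \frac12\right| \le \frac12$, $d_-^r \le \mathrm{vol}\, S$, and $(\mathrm{vol}\, S)(\mathrm{vol}\, S^c) \ge d_-^r (\mathrm{vol}\, V - d_-^r)$, we have

$$\tau_t = \lambda_2^{-1} \log\left( \frac{(\mathrm{vol}\, S)^{1/2}\, (\mathrm{vol}\, S^c)^{1/2}}{(\mathrm{vol}\, V)^{1/2}\, \left|R_S - \frac12\right|\, d_-^{r/2}} \right) \ge \lambda_2^{-1} \log\left( 2\sqrt{1 - \frac{d_-^r}{\mathrm{vol}\, V}} \right) \ge \lambda_2^{-1} \log\sqrt{2}.$$

Since $d_-^r \le \mathrm{vol}\, S$, we have

$$\tau_\rho = \lambda_n^{-1} \log\left(1 + \tfrac12 \sqrt{\frac{d_-^r}{\mathrm{vol}\, S}}\right) \le \lambda_n^{-1} \log\frac32.$$

The result follows. □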


Theorem 4.4 further reinforces our intuition that $\tau$ should be chosen in the interval $(\lambda_n^{-1}, \lambda_2^{-1})$.
If, for a particular graph, this interval is very small, then Theorems 4.2 and 4.3 cannot provide
an interval for which the (MBO$_\tau$) iterations have a chance of being non-stationary after the first
iteration. Note, however, that the interval $[\tau_\rho, \tau_t]$ given by these theorems is not necessarily a
sharp interval for interesting dynamics.


4.3 A Lyapunov functional for the graph (MBO$_\tau$) algorithm

In this section we introduce a functional which is decreasing on iterations of the (MBO$_\tau$)
algorithm. The analogous functional for the continuum setting was recently found in [EO13].
The functional is then used to show that the (MBO$_\tau$) algorithm with any initial condition
converges to a stationary state in a finite number of iterations.

Let $\tau > 0$ and consider the functional $J \colon \mathcal{V} \to \mathbb{R}$ defined by

$$J(u) = \langle 1 - u, e^{-\tau\Delta}u \rangle_{\mathcal{V}}. \tag{33}$$

Note that, by Lemma 2.5(a), $J(u) = M(u) - \langle u, e^{-\tau\Delta}u \rangle_{\mathcal{V}}$, where $M$ is the mass from (10).

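Lemma 4.5. The functional $J$ defined in (33) has the following elementary properties.

1. $J$ is a strictly concave functional on $\mathcal{V}$.

2. $J$ is Fréchet differentiable with derivative in the direction $v$ given by

$$L_u(v) := \left\langle \frac{\delta J}{\delta u}, v \right\rangle_{\mathcal{V}},$$

where

$$\frac{\delta J}{\delta u} = 1 - 2e^{-\tau\Delta}u.$$

Proof. We compute, for all $v \neq 0$,

$$\frac{d^2}{d\alpha^2} J(u + \alpha v) = -2 \langle v, e^{-\tau\Delta}v \rangle_{\mathcal{V}} < 0.$$

Taking the first variation of $J(u) = \langle 1 - u, e^{-\tau\Delta}u \rangle_{\mathcal{V}}$, we find that

$$\left\langle \frac{\delta J}{\delta u}, \delta u \right\rangle_{\mathcal{V}} := \langle 1 - u, e^{-\tau\Delta}\delta u \rangle_{\mathcal{V}} - \langle \delta u, e^{-\tau\Delta}u \rangle_{\mathcal{V}} = \langle 1 - 2e^{-\tau\Delta}u, \delta u \rangle_{\mathcal{V}},$$

as desired. □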
Define the convex set $\mathcal{K} := \{\phi \in \mathcal{V} : \forall j \in V,\ \phi_j \in [0, 1]\}$. It is instructive to consider the
optimization problem,

$$\min_{u \in \mathcal{K}} J(u). \tag{34}$$

Since the objective function in (34) is concave and the admissible set is a compact and convex
set, it follows that the solution to (34) is attained by a vertex function $u \in \mathcal{B} := \{u \in \mathcal{V} : \forall j \in V,\ u_j \in \{0, 1\}\}$. Here $\mathcal{B}$ is the set of binary vertex functions, taking the value 0 or 1 on each
vertex. The sequential linear programming approach to solving the system (34) is to consider
a sequence of vertex functions $\{u_k\}_{k=0}^{\infty}$ which satisfies

$$u_{k+1} = \operatorname*{argmin}_{v \in \mathcal{K}} L_{u_k}(v), \qquad u_0 = \chi_S, \text{ for a node set } S \subset V. \tag{35}$$

The optimization problem in (35) may not have a unique solution, so the iterates are not well-defined.
The following proposition shows that the iterations of the (MBO$_\tau$) algorithm define a
unique sequence satisfying (35). Note that the optimization problem in (35) is the minimization
of a linear objective function over a compact and convex set, implying that for any sequence
satisfying (35), $u_k \in \mathcal{B}$ for all $k \ge 0$.

Proposition 4.6. The iterations defined by the (MBO$_\tau$) algorithm satisfy (35). The functional
$J$, defined in (33), is non-increasing on the iterates $\{u_k\}_{k=1}^{\infty}$, i.e., $J(u_{k+1}) \le J(u_k)$, with equality
only obtained if $u_{k+1} = u_k$. Consequently, the (MBO$_\tau$) algorithm with any initial condition
converges to a stationary state in a finite number of iterations.

Proof. At each iteration $k$, the objective functional $L_{u_k}$ is linear and thus the minimum is
attained by a function

$$u_{k+1} = \begin{cases} 1 & \text{if } 1 - 2e^{-\tau\Delta}u_k \le 0, \\ 0 & \text{if } 1 - 2e^{-\tau\Delta}u_k > 0 \end{cases} \;=\; \chi_{\{e^{-\tau\Delta}u_k \ge \frac12\}}.$$

These are precisely the (MBO$_\tau$) iterations. By the strict concavity of $J$ and linearity of $L_{u_k}$,
for $u_{k+1} \neq u_k$,

$$J(u_{k+1}) - J(u_k) < L_{u_k}(u_{k+1} - u_k) = L_{u_k}(u_{k+1}) - L_{u_k}(u_k).$$

Since $u_k \in \mathcal{K}$, $L_{u_k}(u_{k+1}) \le L_{u_k}(u_k)$, which implies $J(u_{k+1}) < J(u_k)$. The convergence of the
algorithm in a finite number of iterations then follows from the fact that $\mathcal{B}$ contains only a finite
number of points, the vertices of the unit $n$-cube. □
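The monotonicity of $J$ along the iterates can be observed numerically. The sketch below is illustrative only and rests on assumptions not fixed by the text: a unit-weight path graph on 6 nodes, $r = 0$ (so $\langle\cdot,\cdot\rangle_{\mathcal{V}}$ is the Euclidean inner product), an arbitrary initial set and time step, and hypothetical helper names. The matrix exponential is approximated by a scaled Taylor series.

```python
def mat_mul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def expm(A, terms=25, squarings=8):
    """Approximate exp(A): Taylor series of A / 2**squarings, then repeated squaring."""
    n = len(A)
    X = [[a / 2 ** squarings for a in row] for row in A]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[t / k for t in row] for row in mat_mul(term, X)]  # X**k / k!
        result = [[r + t for r, t in zip(rr, tt)] for rr, tt in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def path_laplacian(n):
    """Unnormalized Laplacian (r = 0) of the unit-weight path graph on n nodes."""
    L = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        L[i][i] += 1.0
        L[i + 1][i + 1] += 1.0
        L[i][i + 1] -= 1.0
        L[i + 1][i] -= 1.0
    return L

def diffuse(heat, u):
    return [sum(h * x for h, x in zip(row, u)) for row in heat]

def lyapunov(heat, u):
    """J(u) = <1 - u, exp(-tau L) u> from (33), with the Euclidean inner product."""
    return sum((1.0 - x) * y for x, y in zip(u, diffuse(heat, u)))

def mbo_step(heat, u):
    return [1.0 if v >= 0.5 else 0.0 for v in diffuse(heat, u)]

tau = 0.7
L = path_laplacian(6)
heat = expm([[-tau * a for a in row] for row in L])
iterates = [[1.0, 0.0, 0.0, 1.0, 1.0, 0.0]]      # u_0 = chi_S with S = {0, 3, 4}
energies = [lyapunov(heat, iterates[0])]
for _ in range(2 ** len(L)):                      # at most 2^n distinct binary states
    nxt = mbo_step(heat, iterates[-1])
    if nxt == iterates[-1]:
        break
    iterates.append(nxt)
    energies.append(lyapunov(heat, nxt))
```

The list `energies` records $J(u_k)$ along the run; by Proposition 4.6 it should be non-increasing, and the loop should stop at a stationary state well before the cap of $2^n$ iterations.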

Proposition 4.6 shows that $J$ is a Lyapunov function for the (MBO$_\tau$) iterates. From the
proof of Proposition 4.6, we also note that the non-uniqueness of the iterates in (35) corresponds
to the choice in the (MBO$_\tau$) algorithm of thresholding vertices $\{j \in V : (e^{-\tau\Delta}u_k)_j = \frac12\}$ to either
0 or 1 (see Remark 4.1).

Remark 4.7. The framework of [EO13] easily allows for the extension of the MBO algorithm
to more phases; however, we do not pursue these ideas here.

4.4 A local guarantee for a 'nonfrozen' (MBO$_\tau$) iteration

We begin by observing that the constant $\tau_\kappa$ in Theorem 4.2 depends on the maximum curvature
of the indicator set in the graph, $\|\Delta\chi_S\|_{\mathcal{V},\infty}$. In this section, we prove a theorem which
gives a condition on $\tau$ in terms of the local curvature $(\kappa_S^{1,r})_i$ at a node $i$, which guarantees that
the value of $u$ on that node will change in one iteration of the graph (MBO$_\tau$) scheme.
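We first introduce some notation which is needed to state the theorem. Recall that the set
of neighbors of a node $i \in V$ is $\mathcal{N}_i = \{j \in V : \omega_{ij} > 0\}$. Let $1 \in V$ be an arbitrary node in the
graph $G$ and let $S \subset V$. We define the sets

$$S_1 := \begin{cases} \mathcal{N}_1 \cap S^c & \text{if } 1 \in S, \\ \mathcal{N}_1 \cap S & \text{if } 1 \notin S, \end{cases} \qquad \text{and} \qquad \bar S_1 := S_1 \cup \{1\}.$$

The set $S_1$ contains neighbors of node 1 which are also in either the boundary $\partial(S^c)$ or $\partial S$
(depending on whether or not $1 \in S$). For $u \in \mathcal{V}$, define $\Delta'$ as

$$(\Delta' u)_i := \begin{cases} d_i^{-r} \sum_{j \in \bar S_1} \omega_{ij} (u_i - u_j) & \text{if } i \in \bar S_1, \\ 0 & \text{if } i \notin \bar S_1. \end{cases}$$

We see that $\Delta'$ on $\bar S_1$ is similar to the Laplacian on the subgraph induced by $\bar S_1$, with the
important distinction that, for each $i \in V$, the degree $d_i$ is the degree of $i$ in the full graph $G$,
not the degree in the subgraph induced by $\bar S_1$. In [Chu97, Section 8.4], $\Delta'$ is referred to as the
Laplacian with Dirichlet conditions on $\partial(\bar S_1)$. If $v \in \mathcal{V}_1 := \{v \in \mathcal{V} : v = 0 \text{ on } \bar S_1^c\}$, then

$$(\Delta' v)_i = \begin{cases} (\Delta v)_i & \text{if } i \in \bar S_1, \\ 0 & \text{if } i \notin \bar S_1. \end{cases}$$

Note in particular that, if $v \in \mathcal{V}_1$, then $e^{-t\Delta'}v \in \mathcal{V}_1$ for all $t \ge 0$.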

Theorem 4.8. Let $1 \in V$ be an arbitrary node and $S \subset V$ be such that $|(\kappa_S^{1,r})_1|^2 > \|(\Delta')^2\chi_{S_1}\|_{\mathcal{V},\infty}$.
If $\tau \in (\tau_1, \tau_2)$, where

$$\tau_{1,2} := \|(\Delta')^2\chi_{S_1}\|_{\mathcal{V},\infty}^{-1} \left( |(\kappa_S^{1,r})_1| \mp \sqrt{|(\kappa_S^{1,r})_1|^2 - \|(\Delta')^2\chi_{S_1}\|_{\mathcal{V},\infty}} \right) > 0, \tag{36}$$

then

$$|(P e^{-\tau\Delta}\chi_S)_1 - (\chi_S)_1| = 1.$$

That is, the phase at node 1 changes after one (MBO$_\tau$) iteration.

It is important to note that both $|(\kappa_S^{1,r})_1|^2$ and $\|(\Delta')^2\chi_{S_1}\|_{\mathcal{V},\infty}$ are local quantities, in the
sense that they only depend on the structure of $G$ at node 1 and its neighbors. This is in contrast
with Theorem 4.2, which depends on the spectrum of the Laplacian on $G$. The existence of a
lower bound $\tau_1$ on $\tau$ is unsurprising in the light of this earlier freezing result. The necessity for
an upper bound $\tau_2$ can be understood from our wish to only use local quantities in this theorem.

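Proof of Theorem 4.8. First we assume that $1 \notin S$, so $S_1 = \mathcal{N}_1 \cap S$. By the comparison
principle in Lemma 2.5(d), $\chi_{S_1} \le \chi_S$ on $V$ implies $(e^{-\tau\Delta}\chi_{S_1})_1 \le (e^{-\tau\Delta}\chi_S)_1$. In particular,
since $(\chi_{S_1})_1 = (\chi_S)_1 = 0$, we have

$$(e^{-\tau\Delta}\chi_S - \chi_S)_1 \ge (e^{-\tau\Delta}\chi_{S_1} - \chi_{S_1})_1.$$

Let $v$ satisfy the heat equation with Dirichlet boundary data,

$$\begin{cases} \dot v_i = -(\Delta' v)_i, \\ v(0) = \chi_{S_1}. \end{cases}$$

As noted above the theorem, $v(t) \in \mathcal{V}_1$ for all $t \ge 0$.

It is easily checked that $v$ is subcaloric, i.e., $\dot v_i \le -(\Delta v)_i$ for all $i \in V$, and $v(0) \le \chi_S$. In
addition, the Laplacian satisfies $-(\Delta u)_i \le -(\Delta \tilde u)_i$ if $u_i = \tilde u_i$ and $u_j \le \tilde u_j$, for $j \neq i$. Hence, by
the theory of differential inequalities (see for example [Sza65, Theorem 8.1(3)]),

$$v_i(t) \le (e^{-t\Delta}v(0))_i = (e^{-t\Delta}\chi_{S_1})_i, \quad \text{for all } i \in V.$$

In particular,

$$(e^{-\tau\Delta}\chi_{S_1} - \chi_{S_1})_1 \ge v_1(\tau) - v_1(0) = (e^{-\tau\Delta'}\chi_{S_1} - \chi_{S_1})_1 = -\tau (\Delta'\chi_{S_1})_1 + \tau^2 r(\tau),$$

where

$$|r(\tau)| \le \frac12 \sup_{t \in [0,\tau]} \left| \left( e^{-t\Delta'} (\Delta')^2 \chi_{S_1} \right)_1 \right| \le \frac12 \|(\Delta')^2\chi_{S_1}\|_{\mathcal{V},\infty}.$$

Note that $-(\Delta'\chi_{S_1})_1 = -(\kappa_S^{1,r})_1 = |(\kappa_S^{1,r})_1|$, where the last equality follows because $1 \notin S$.

We conclude that

$$(e^{-\tau\Delta}\chi_S - \chi_S)_1 \ge (e^{-\tau\Delta}\chi_{S_1} - \chi_{S_1})_1 \ge |(\kappa_S^{1,r})_1|\, \tau - \frac12 \|(\Delta')^2\chi_{S_1}\|_{\mathcal{V},\infty}\, \tau^2,$$

hence

$$(e^{-\tau\Delta}\chi_S - \chi_S)_1 \ge \frac12 \quad \text{if } \tau \in [\tau_1, \tau_2],$$

which proves the result for the case in which $1 \notin S$.

To prove the desired statement if $1 \in S$, we note that

$$(e^{-\tau\Delta} - \mathrm{Id})(\chi_S + \chi_{S^c}) = 0,$$

so the condition $(e^{-\tau\Delta}\chi_S - \chi_S)_1 < -\frac12$ is equivalent to $(e^{-\tau\Delta}\chi_{S^c} - \chi_{S^c})_1 > \frac12$. Recall that,
in this case, $S_1 = \mathcal{N}_1 \cap S^c$, and the same derivation as above holds$^{14}$, since $1 \notin S^c$, with the
exception that the admissible range of $\tau$ becomes the open interval $(\tau_1, \tau_2)$. This is because, by
our definition of the (MBO$_\tau$) algorithm, the thresholding operator thresholds the $\frac12$-level set to 1.

□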


In the remainder of this section we determine some conditions under which the requirement
$|(\kappa_S^{1,r})_1|^2 > \|(\Delta')^2\chi_{S_1}\|_{\mathcal{V},\infty}$ in Theorem 4.8 is satisfied. To this end, define the reduced degrees,
for $i \in V$, as

$$d_i' := \sum_{j \in S_1} \omega_{ij}. \tag{37}$$

Lemma  4.9.  |  (^v^.’7')i | 2  >  ||  (A')2xsi  ||v,oo  if  and  only  if 


d1  r{d\Y  >  rnaxd, 

ieSi 


—r 

i 


~di  r4  -  Y  d]  rUij  +  Y  dkrd'kujik 
.leSi  fceSf 


(38) 


Proof.  Consider  the  |V|  x  |V|  matrix  corresponding  to  Ah  After  possibly  relabeling  the  nodes, 

Jj  q  \  _  _ 

it  can  be  written  as  [  j  ,  where  L'  is  the  1 5i  |  x  1 5i  |  matrix  with  entries 


L'ij  =  4 


_r  j  -di  if  i  =  j, 

Uij  if  i  0  j. 


$^{14}$ In particular, carefully note that now $-(\Delta'\chi_{S_1})_1 = (\kappa_S^{1,r})_1 = |(\kappa_S^{1,r})_1|$ holds.
Then


(L  2)ij  —  ( L  )ik(L  )kj  —  ( L  )u(L  )ij  +  (L  (L  )ik(L  )kj 

fceST  kesl\{i,j} 


=  dr 


~(d]  T  dj  r)uij  +  dkTUJikUljk 

keSi\{i,j} 


Thus  ((A/)2X51)i  =  0  if  z  0  Si,  and,  for  ieSj,  we  have 


((ATxaJi  =  d~T  £ 

jeSi 

=  «r  £ 

jeSi 


~  r  +  dj  r  )^ij  +  dfc  ruJikUJjk 

keSi\{i,j} 


~~(dj  r  +  dj  r)^ij  +  dkrujikuijk 
fcesT 


=  dj 


‘d;  rd-  -  ^  d*  +  Y,  dkrd'kUik 
ieSi  fcesT 


where,  for  the  second  equality,  we  used  that  uia  =  ujjj  =  0. 

From  (14)  and  the  definition  of  Si,  we  find 

I  /  l,r\  1 2  j— 2r  \  '  r— 2rj/2 

|(«s  )i|  =  d:  2^  =  dx  dl  . 

j,keSi 


□ 


Corollary 4.10. Let $r = 1$ and $d_1' > 0$. If there exists an $0 < \epsilon < 1$ such that, for all $i \in S_1$,

$$\frac{d_i'}{d_i} \le \epsilon \frac{d_1'}{d_1} \quad \text{and} \quad \frac{\omega_{i1}}{d_i} < (1 - \epsilon^2) \frac{d_1'}{d_1}, \tag{39}$$

then condition (38) is satisfied, and there exist $0 < \tau_1 < \tau_2$ as in Theorem 4.8.

If the first condition in (39) is satisfied with $0 < \epsilon < \frac12 (\sqrt5 - 1) \approx 0.618$, then the second
condition in (39) can be replaced by the condition that, for all $i \in S_1$, $\omega_{i1} \le d_i'$.

Note,  by  (14)  and  (37),  that  the  conditions  in  (39)  can  be  rewritten  as 

(4)1)*  -  1  >  e(4’i1)i  and  («{!})*  >  (!  -  e2)^1)!’ 

for  all  i  £  Si. 

Proof of Corollary 4.10. For $r = 1$, we compute, for $i \in \bar S_1$,


d~r 


-d\  rd'i  -  Y  d)  r“ii  +  Y  dkrd'k“ik 

ieSi  fees! 


di  di  !  dkuik 

di  di  _ dk  di 

fceSi 


\  d'k  ujik 
dk  di 

fceSi 


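hence condition (38) becomes $\left( \frac{d_1'}{d_1} \right)^2 > \max_{i \in \bar S_1} \sum_{k \in \bar S_1} \frac{d_k'}{d_k} \frac{\omega_{ik}}{d_i}$. If the first condition in (39) is satisfied,
we have

$$\sum_{k \in \bar S_1} \frac{d_k'}{d_k} \frac{\omega_{ik}}{d_i} = \frac{d_1'}{d_1} \frac{\omega_{i1}}{d_i} + \sum_{k \in S_1} \frac{d_k'}{d_k} \frac{\omega_{ik}}{d_i} \le \frac{d_1'}{d_1} \frac{\omega_{i1}}{d_i} + \epsilon \frac{d_1'}{d_1} \sum_{k \in S_1} \frac{\omega_{ik}}{d_i} = \frac{d_1'}{d_1} \left[ \frac{\omega_{i1}}{d_i} + \epsilon \frac{d_i'}{d_i} \right].$$

Since $d_1' > 0$, condition (38) reduces to $\max_{i \in \bar S_1} \left[ \frac{\omega_{i1}}{d_i} + \epsilon \frac{d_i'}{d_i} \right] < \frac{d_1'}{d_1}$. If the maximum is achieved at
$i = 1$, then $\frac{\omega_{11}}{d_1} + \epsilon \frac{d_1'}{d_1} = \epsilon \frac{d_1'}{d_1} < \frac{d_1'}{d_1}$. If, on the other hand, the maximum is achieved at some
$i \in S_1$, then $\frac{\omega_{i1}}{d_i} + \epsilon \frac{d_i'}{d_i} \le \frac{\omega_{i1}}{d_i} + \epsilon^2 \frac{d_1'}{d_1} < \frac{d_1'}{d_1}$. Here, we first used the first condition from (39),
and then the second. Combined with Theorem 4.8 and Lemma 4.9, this proves the first claim.

Now assume, instead of the second condition in (39), that $\omega_{i1} \le d_i'$ for all $i \in S_1$. Then
$\frac{\omega_{i1}}{d_i} \le \frac{d_i'}{d_i} \le \epsilon \frac{d_1'}{d_1}$. Hence, if $0 < \epsilon < 1 - \epsilon^2$, the second condition in (39) is satisfied. This
requirement is met if $0 < \epsilon < \frac12 (\sqrt5 - 1)$.

□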


It is worthwhile to understand the conditions in the corollary. Adding the two conditions
in (39) gives $\frac{d_i' + \omega_{i1}}{d_i} < (1 + \epsilon - \epsilon^2) \frac{d_1'}{d_1}$. The ratio $\frac{d_i' + \omega_{i1}}{d_i}$ is a measure of the relative strength of
connection of node $i$ within the set $\bar S_1$ compared to all its connections in $G$. Similarly, $\frac{d_1'}{d_1}$ is the
relative strength of connection of node 1 within $\bar S_1$ (or, equivalently, within $S_1$), compared to all
its connections in $G$. The conditions in (39) thus require node 1 to be a node with comparatively
large relative connection strength within $\bar S_1$, compared to the other nodes in $\bar S_1$. This will allow
enough mass to diffuse to or away from node 1 (depending on whether or not $1 \in S$), for it to
pass the threshold value $\frac12$ without too much of the locally available mass diffusing to other
nodes.

Some examples in Section 6 further examine the conditions in (38) and (39).

Remark 4.11. One can interpret the condition $|(\kappa_S^{1,r})_1|^2 > \|(\Delta')^2\chi_{S_1}\|_{\mathcal{V},\infty}$ from Theorem 4.8
in terms of the spectral radius of $\Delta'$ (similar to [Chu97, Equation (8.7)]). We compute


p(A')  =  sup 

«ev\{0} 

<  sup 

«ev\{o} 


(u,  A 'u)v  1  E;jesr uv (vi  -  vi) 

- —  -  sup  - - - 

«ev\{o} 

2  I  „,2 


|  \u  1 1  v  2 

E?;jesr^jk2  +  Uf) 


EjcsT  d\vl 


=  2 


sup 


Ei 


,jeSi  uvvi 


ev\{0}  22iGsldivi 


Eiesl  divi 


=  2  sup  _ 

«ev\{o}  A,ieSi  “iT 


<  2 (d-)-rd+, 


where 


di  ■■=  ^2  u), 
ieW 


ZJ, 


d. |_  :=  max  di . 
ieSi 


d-  :=  mind*. 
ieSt 


Because $((\Delta')^2\chi_{S_1})_i = 0$ if $i \notin \bar S_1$, it is straightforward to adapt the proof of Lemma 2.1 to
find $(d_-)^{r/2}\, \|(\Delta')^2\chi_{S_1}\|_{\mathcal{V},\infty} \le \|(\Delta')^2\chi_{S_1}\|_{\mathcal{V}}$.


Combining  these  results,  the  condition  | 

\—2r 


1,r)i|2  >  ||(A')2xs’illv,oo  is  satisfied  if  dlzr{d!l)z  > 


—  2r  /  r/  \2 


4(d_)  2(d_)  2r(d_)_)2y/vol  Si,  or,  equivalently,  if 

i(£)  (I) 

Using  vol  Si  =  £ie5l  di  >  |Si|dL,  we  can  deduce  the  stronger  sufficient  condition 

2  /  7  \  2  r 


d+ 


di 


-r-  > 


§1  »'■ 
5 Allen-Cahn equation on graphs


In  this  section,  we  investigate  the  Allen-Cahn  equation  on  graphs.  A  short  overview  of  the 
continuum  Allen-Cahn  equation  can  be  found  in  Section  A.l  in  Appendix  A. 

We propose the following Allen-Cahn equation (ACE) on graphs, for all $i \in V$:

$$\begin{cases} \dot u_i = -(\Delta u)_i - \frac{1}{\varepsilon} d_i^{-r} W'(u_i) & \text{for } t > 0, \\ u_i = (u_0)_i & \text{at } t = 0, \end{cases} \tag{ACE$_\varepsilon$}$$

for a given initial condition $u_0 \in \mathcal{V}$ and $\varepsilon > 0$. Here $W \in C^2(\mathbb{R})$ is a double well potential. For
definiteness we set $W$ to be the standard double well potential $W(u) = (u + 1)^2 (u - 1)^2$, hence
$W'(u) = 4u(u^2 - 1)$ and $W$ has two stable minima at the wells at $u = \pm 1$ and an unstable
local maximum at $u = 0$. Recall that the sign convention for the Laplacian is opposite to the
one used in the continuum literature. Note that for $\varepsilon$ sufficiently small, this system has $3^n$
equilibria, of which $2^n$ are stable.

The (MBO$_\tau$) algorithm is closely related to time-splitting methods applied to the Allen-Cahn
evolution (ACE$_\varepsilon$). The diffusion step is precisely the time evolution with respect to the
first term of (ACE$_\varepsilon$) and the thresholding step is the asymptotic behavior of evolution with
respect to the second term of (ACE$_\varepsilon$).

The case $V = \mathbb{Z}^d$ with weights $\omega_{ij} = \omega(\|i - j\|)$ was considered in [BC99], where it is
seen as an approximation to the Ising model, and stationary solutions and traveling waves
are constructed for $\varepsilon$ small enough. The authors note that, when $\omega_{ij}$ corresponds to nearest
neighbors, this equation is known as the discrete Nagumo equation, which is a simplified model
of neural networks. In this context, [HPS11] considered the Nagumo equation in $\mathbb{Z}^1$ and derived
the existence of traveling waves. In general, we are not aware of any previous works where
(ACE$_\varepsilon$) is considered for an arbitrary weighted graph $(V, E, \omega_{ij})$. It would be interesting to see
whether the analysis in [BC99] can be extended to use (ACE$_\varepsilon$) to study phase transitions
in general graphs, a topic of interest in other areas of mathematics [Lyo00].

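Just as in the continuum case, we arrive at (ACE$_\varepsilon$) as the gradient flow given by the graph
Ginzburg-Landau functional, $\mathrm{GL}_\varepsilon \colon \mathcal{V} \to \mathbb{R}$,

$$\mathrm{GL}_\varepsilon(u) := \frac12 \|\nabla u\|_{\mathcal{E}}^2 + \frac{1}{\varepsilon} \langle D^{-r} W \circ u, 1 \rangle_{\mathcal{V}}, \tag{GL$_\varepsilon$}$$

where $(D^{-r} W \circ u)_i = d_i^{-r} W(u_i)$, and whose first variation is given by

$$\left. \frac{d}{dt} \mathrm{GL}_\varepsilon(u + tv) \right|_{t=0} = \langle \Delta u, v \rangle_{\mathcal{V}} + \frac{1}{\varepsilon} \langle D^{-r} W'(u), v \rangle_{\mathcal{V}}.$$

The factor $d_i^{-r}$ in the potential term is needed to cancel the factor $d_i^r$ in the $\mathcal{V}$-inner product.
Equation (ACE$_\varepsilon$) is then the $\mathcal{V}$-gradient flow associated with (GL$_\varepsilon$).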
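The first variation formula can be sanity-checked by a finite difference. The sketch below assumes, as an illustration rather than a statement of the paper's definitions, that $\|\nabla u\|_{\mathcal{E}}^2 = \frac12 \sum_{i,j} \omega_{ij} (u_i - u_j)^2$ (the choice for which $\langle \Delta u, v \rangle_{\mathcal{V}} = \langle \nabla u, \nabla v \rangle_{\mathcal{E}}$) and that $\langle u, v \rangle_{\mathcal{V}} = \sum_i d_i^r u_i v_i$, so the powers of $d_i$ cancel and $r$ drops out of both sides. The weights, test vectors and function names are hypothetical.

```python
def W(x):
    return (x + 1) ** 2 * (x - 1) ** 2

def dW(x):
    return 4 * x * (x * x - 1)

def gl_energy(weights, u, eps):
    """GL_eps(u) = 1/2 ||grad u||^2 + (1/eps) sum_i W(u_i), with the assumed edge norm."""
    n = len(u)
    grad_sq = 0.5 * sum(weights[i][j] * (u[i] - u[j]) ** 2
                        for i in range(n) for j in range(n))
    return 0.5 * grad_sq + sum(W(x) for x in u) / eps

def first_variation(weights, u, v, eps):
    """<Lap u, v>_V + (1/eps) <D^{-r} W'(u), v>_V; the degree factors cancel."""
    n = len(u)
    laplacian_term = sum(v[i] * weights[i][j] * (u[i] - u[j])
                         for i in range(n) for j in range(n))
    return laplacian_term + sum(dW(x) * y for x, y in zip(u, v)) / eps

weights = [[0.0, 1.0, 2.0],
           [1.0, 0.0, 0.5],
           [2.0, 0.5, 0.0]]
u = [0.3, -0.8, 1.2]
v = [1.0, -0.5, 0.25]
eps, h = 0.5, 1e-6

plus = gl_energy(weights, [a + h * b for a, b in zip(u, v)], eps)
minus = gl_energy(weights, [a - h * b for a, b in zip(u, v)], eps)
finite_difference = (plus - minus) / (2 * h)      # central difference in direction v
exact = first_variation(weights, u, v, eps)
```

The two numbers `finite_difference` and `exact` should agree up to the discretization and rounding error of the central difference.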

Recall that the Laplacian $\Delta$ also depends on $r$. In fact, the equation in (ACE$_\varepsilon$) can be
rewritten as

$$d_i^r\, \dot u_i = -\sum_{j \in V} \omega_{ij} (u_i - u_j) - \frac{1}{\varepsilon} W'(u_i),$$

showing that the factor $d_i^r$ can be interpreted as a node-dependent time rescaling.

By standard ODE arguments and the smoothness of the right hand side of (ACE$_\varepsilon$), for each
$\varepsilon > 0$ a unique $C^1$ solution to (ACE$_\varepsilon$) exists for all $t > 0$.

This continuum case (see Appendix A.1) suggests an approach for finding a valid notion of
mean curvature (and its flow) for graphs: Take initial data $u(0) = \chi_S - \chi_{S^c}$, for some node
set $S \subset V$, and consider the corresponding solution $u^\varepsilon(t)$ to (ACE$_\varepsilon$), for all times $t > 0$. The
question is whether the limit

$$u_i(t) := \lim_{\varepsilon \to 0^+} u_i^\varepsilon(t)$$

exists. Even if it does, it is unlikely that $u$ is of the form $\chi_{S(t)} - \chi_{S(t)^c}$, which can be interpreted
as a binary indicator function for some evolving set $S(t)$, for all times $t > 0$, but there may be
an approximate phase separation: $u_i(t) \in [-1 - \delta, -1 + \delta] \cup [1 - \delta, 1 + \delta]$, for some small $\delta > 0$. Is
there a way to characterize the evolution of the "interface" between the two level sets of $u_i(t)$?

However, a little analysis shows the above approach is rather naive. Indeed, unlike in the
continuum case, the graph Laplacian of the indicator function of a set $S \subset V$ is always a well-defined
bounded function (in any norm). Thus, for small $\varepsilon$ the potential term in the equation
will dominate the dynamics, and pinning or freezing will occur, as proven in Theorem 5.3. This
is the dynamics in which the sign of the value of $u$ on each node is fixed by the sign of the initial
value, and $u$ at each node just settles into the corresponding well of $W$.

As discussed at the start of Section 3.3, the question of how to connect the sequence of sets
evolving by graph mean curvature to the super (or sub) level sets $\{i \in V : u_i^\varepsilon(t) > 0\}$ for
solutions of (ACE$_\varepsilon$) is still open. See also Question 7.4.

Remark 5.1. Note that in the (MBO$_\tau$) algorithm, the values of $u$ are reinitialized in every
iteration to 0 or 1. Our choice of the double well potential $W$ in (ACE$_\varepsilon$) has two equilibria
corresponding to the level sets for $\pm 1$. Correspondingly, the unstable equilibrium for (MBO$_\tau$)
corresponds to the $1/2$ level set, while for (ACE$_\varepsilon$) it corresponds to the 0 level set. This agrees
with the now standard notations for Allen-Cahn and MBO.

Below, we show that for all $\varepsilon$ below a finite $\varepsilon_0 > 0$ the functions $u^\varepsilon(t)$ do not change sign as
$t$ varies, so that pinning occurs. Recall that a set which contains the forward orbit of each of its
elements is called positively invariant, and that the number of nodes in the graph $G$ is $|V| = n$.

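Lemma 5.2. Consider the set $\mathcal{S} := \{u \in \mathcal{V} : \|u\|_{\mathcal{V}} \le \frac32 \sqrt{n}\, d_+^{r/2}\}$ and let $u(t)$ be the solution to
(ACE$_\varepsilon$) for a given $\varepsilon > 0$. Then $t \mapsto \|u(t)\|_{\mathcal{V}}$ is decreasing at each $t$ such that $u(t) \in \mathcal{S}^c$. As a
consequence, the set $\mathcal{S}$ is positively invariant and every trajectory of (ACE$_\varepsilon$) enters $\mathcal{S}$ in finite
time.

Proof. Define the set $A(t) := \{i \in V : u_i^2(t) < 2\}$. We compute

$$\begin{aligned} \frac{d}{dt} \|u(t)\|_{\mathcal{V}}^2 &= 2 \langle \dot u(t), u(t) \rangle_{\mathcal{V}} \\ &= -2 \|\nabla u(t)\|_{\mathcal{E}}^2 - \frac{8}{\varepsilon} \sum_{i \in V} u_i(t)^2 (u_i(t)^2 - 1) \\ &= -2 \|\nabla u(t)\|_{\mathcal{E}}^2 - \frac{8}{\varepsilon} \sum_{i \in A^c(t)} u_i(t)^2 (u_i(t)^2 - 1) + \frac{8}{\varepsilon} \sum_{i \in A(t)} u_i(t)^2 (1 - u_i(t)^2) \\ &\le -\frac{8}{\varepsilon} \left( \sum_{i \in A^c(t)} u_i(t)^2 - \frac14 |A(t)| \right). \end{aligned}$$

The last inequality follows, since $u_i(t)^2 - 1 \ge 1$ for $i \in A^c(t)$, and $\max\{x^2 (1 - x^2) : x^2 < 2\} = \frac14$.
Note that $\|u(t)\|_{\mathcal{V}}^2 \le d_+^r \sum_{i \in V} u_i(t)^2$. Thus, if $u(t) \in \mathcal{S}^c$, then $\sum_{i \in V} u_i(t)^2 > \frac94 n$, and hence

$$\sum_{i \in A^c(t)} u_i(t)^2 = \sum_{i \in V} u_i(t)^2 - \sum_{i \in A(t)} u_i(t)^2 > \frac94 n - 2 |A(t)|.$$

Therefore,

$$\frac{d}{dt} \|u(t)\|_{\mathcal{V}}^2 < -\frac{8}{\varepsilon} \left( \frac94 n - \frac94 |A(t)| \right) \le 0,$$

where we have used that $|A(t)| \le n$. This shows $\|u(t)\|_{\mathcal{V}}$ is decreasing in the region $\mathcal{S}^c$, as
desired. The other statements in the lemma now follow. □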

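Theorem 5.3. Assume $|u_i(0)| > 0$ for all $i \in V$. There exist an $\varepsilon_\rho$ and an $\varepsilon_\kappa$ (depending on
the spectral radius of $\Delta$ via (41), and on $\sup_{t \ge 0} \|\Delta u(t)\|_{\mathcal{V},\infty} < \infty$ via (42), respectively), such
that, if either $\varepsilon < \varepsilon_\rho$ or $\varepsilon < \varepsilon_\kappa$, then the solution $u(t)$ to (ACE$_\varepsilon$) is such that $\mathrm{sign}(u_i(t))$ is
constant in time, for all $i \in V$.

Proof. By Lemma 5.2, $\|u(t)\|_{\mathcal{V}} \le \frac32 \sqrt{n}\, d_+^{r/2}$ for $t$ large enough. Hence, by continuity of $u(t)$, there
is a $C$ (depending on the initial condition) such that, for all $t \ge 0$,

$$\|u(t)\|_{\mathcal{V}} \le C.$$

Thus, if $\rho > 0$ denotes the spectral radius of $\Delta$, we get, for all $i \in V$,

$$d_i^r\, |\Delta u_i(t)|^2 \le \|\Delta u(t)\|_{\mathcal{V}}^2 \le \rho^2 C^2.$$

In particular

$$|\Delta u_i(t)| \le \rho C d_i^{-r/2} \tag{40}$$

for all $i \in V$; thus, we have the inequalities

$$-\rho C d_i^{-r/2} - \frac{1}{\varepsilon} d_i^{-r} W'(u_i) \le \dot u_i \le \rho C d_i^{-r/2} - \frac{1}{\varepsilon} d_i^{-r} W'(u_i).$$

Without loss of generality, we can assume that there is a number $\alpha \in (0, 1)$ such that $|u_i(0)| \ge \alpha$
for all $i \in V$. If there is an $i \in V$ such that $|u_i(t)| = \alpha$ for a given $t$, then we have that
$|W'(u_i(t))| = 4\alpha(1 - \alpha^2)$, with a sign opposite to that of $u_i$. Thus, if

$$\varepsilon < \varepsilon_\rho := C^{-1} \rho^{-1}\, 4\alpha(1 - \alpha^2)\, d_+^{-r/2} \le C^{-1} \rho^{-1}\, 4\alpha(1 - \alpha^2)\, d_i^{-r/2}, \tag{41}$$

then $\dot u_i < 0$ if $u_i(0) < 0$, and $\dot u_i > 0$ if $u_i(0) > 0$. Hence $u_i(t)$ can never reach zero, and by
continuity in $t$ it does not change sign.

Alternatively, instead of (40), we can estimate $|\Delta u_i(t)| \le \sup_{t \ge 0} \|\Delta u(t)\|_{\mathcal{V},\infty} < \infty$. The
finitude of the right hand side follows from (40). Following the same reasoning as above, we
then conclude

$$\varepsilon_\kappa := \left( \sup_{t \ge 0} \|\Delta u(t)\|_{\mathcal{V},\infty} \right)^{-1} d_+^{-r}\, 4\alpha(1 - \alpha^2). \tag{42}$$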

The constant $\varepsilon_\kappa$ in Theorem 5.3 involves $\|\Delta u\|_{\mathcal{V},\infty}$, which is "curvature-like". Compare this
to the constant $\tau_\kappa$ in Theorem 4.2, which depends on the maximum curvature of the indicator
set in the graph, $\|\Delta\chi_S\|_{\mathcal{V},\infty}$, as also discussed in Section 4.4. This tentative similarity makes
us suspect that a condition on the local curvature, similar to those for the (MBO$_\tau$) algorithm
given in Theorem 4.8, guarantees a phase change in the Allen-Cahn flow. We discuss this further
in Question 7.3.
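Pinning for small $\varepsilon$ is easy to observe in a simulation. The following is a rough sketch under assumptions that go beyond the text: the unit-weight cycle $C_6$, $r = 0$, initial data $\chi_S - \chi_{S^c}$, a value $\varepsilon = 0.1$ picked by hand (not computed from (41) or (42)), and a forward Euler discretization with a small fixed step, which is only one of many possible integrators. The function name is hypothetical.

```python
def allen_cahn_run(weights, u0, eps, r=0.0, dt=1e-3, steps=5000):
    """Integrate (ACE_eps) with forward Euler; report whether any node changed sign."""
    n = len(u0)
    deg = [sum(row) for row in weights]
    signs = [x > 0 for x in u0]
    u = list(u0)
    for _ in range(steps):
        # Graph Laplacian (Lap u)_i = d_i^{-r} sum_j w_ij (u_i - u_j).
        lap = [deg[i] ** (-r) * sum(weights[i][j] * (u[i] - u[j]) for j in range(n))
               for i in range(n)]
        # Right-hand side: -(Lap u)_i - (1/eps) d_i^{-r} W'(u_i), with W'(x) = 4x(x^2 - 1).
        u = [u[i] - dt * (lap[i] + deg[i] ** (-r) * 4 * u[i] * (u[i] ** 2 - 1) / eps)
             for i in range(n)]
        if [x > 0 for x in u] != signs:
            return False, u
    return True, u

n = 6
weights = [[1.0 if (i - j) % n in (1, n - 1) else 0.0 for j in range(n)] for i in range(n)]
u0 = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]           # chi_S - chi_{S^c} with S = {0, 1, 2}
signs_pinned, u_final = allen_cahn_run(weights, u0, eps=0.1)
```

In this run every node stays in the well it started in: the values at the two interfaces move slightly towards zero and then settle, which is the frozen behavior described by Theorem 5.3.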

We see that the discrete nature of the graph, manifest in the finite spectral radius of the
Laplacian, makes the limit behavior of (ACE$_\varepsilon$) as $\varepsilon \to 0$ much different than that for the
continuum case. In particular, this means that we ought to look for a notion of mean curvature
flow on graphs more carefully.

Remark 5.4. For $\varepsilon$ small enough, but not smaller than the $\varepsilon_0$ from Theorem 5.3 above, we
expect interesting asymptotic behavior for the motion of the phases in (ACE$_\varepsilon$) on intermediate
time scales. Such asymptotics might be connected to the graph curvature of the phases, which
would match the situation in the continuum setting, where the solution has phases that for large
times behave as if they were evolving by mean curvature flow, while the solution itself becomes
stationary in the limit $t \to +\infty$. This phenomenon is known as dynamic metastability (see for
instance [BK91] and the references therein). See also Question 7.4.


6  Explicit  and  computational  examples 

In this section we give several examples of graphs where the mean curvature, MBO, and Allen-Cahn
evolutions can be compared either explicitly or computationally.


6.1 Complete graph

Consider the complete graph, $K_n$, on $n$ nodes with $\omega_{ij} = \omega$ for all $i, j \in V$. See Figure 1a. In
this case, the matrix representation of the graph Laplacian is given by the circulant matrix,

$$L = \omega [(n - 1)\omega]^{-r} \left( n\, \mathrm{Id}_n - 1_n 1_n^T \right),$$

where $1_n$ denotes the vector in $\mathbb{R}^n$ with all entries equal to 1, and $\mathrm{Id}_n$ the identity matrix in
$\mathbb{R}^{n \times n}$. The eigenvalues of $L$ are given by 0 and $\omega n [(n - 1)\omega]^{-r}$ (with multiplicity $n - 1$). In
particular, $\lambda_2 = \lambda_n = \rho$. Note that the normalized eigenvector corresponding to eigenvalue 0 is
given by $(\mathrm{vol}\, V)^{-1/2} \chi_V$.

Let $S \subset V$ be a set with volume ratio $R_S = \frac{\mathrm{vol}\, S}{\mathrm{vol}\, V}$ (see also Theorem 4.3). Using the spectral
decomposition from (11), the evolution of $\chi_S$ by the heat equation can be explicitly written as

$$e^{-t\Delta}\chi_S = R_S \chi_V + e^{-\rho t} (\chi_S - R_S \chi_V).$$

Assume $R_S \neq \frac12$. Then there exists a critical time step $\tau_c$ depending only on $\mathrm{vol}\, S$, $\mathrm{vol}\, V$,
and $\rho$ such that $\tau < \tau_c$ implies the solution to the (MBO$_\tau$) evolution is pinned and $\tau > \tau_c$
implies exactly one iteration of the (MBO$_\tau$) evolution gives a stationary solution, either 0 or
$\chi_V$ depending on the initial mass, $M(\chi_S) = \mathrm{vol}\, S$ (see (10)). From the solution, the critical
time step $\tau_c$ can be directly computed,

$$\tau_c = \frac{1}{\rho} \log\left( \frac{\max\{R_S, 1 - R_S\}}{\left|\frac12 - R_S\right|} \right).$$

If $R_S = \frac12$, symmetry pins the (MBO$_\tau$) evolution for all $\tau > 0$.

The bound from Theorem 4.2 states that pinning occurs if $\tau < \tau_\rho = \rho^{-1} \log\left(1 + \frac12 (n R_S)^{-1/2}\right)$,
where we have used the fact that, for all $i \in V$, $d_i^r = \frac{\mathrm{vol}\, V}{n}$. The bound in Theorem 4.3 states
that trivial dynamics occur if

$$\tau > \tau_t = \rho^{-1} \log\left( \frac{n^{1/2} R_S^{1/2} R_{S^c}^{1/2}}{\left|\frac12 - R_S\right|} \right).$$

Note that for $n \ge 2$, $R_S \ge \frac1n$ and

$$\tau_t > \tau_c > \tau_\rho.$$
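For the complete graph all three time scales are explicit, so their ordering can be checked directly. The sketch below uses only the closed-form expressions of this subsection; the function names and the sample values $n = 10$, $|S| = 3$, $\omega = 1$, $r = 0$ are illustrative assumptions.

```python
import math

def complete_graph_times(n, k, omega=1.0, r=0.0):
    """Return (rho, tau_rho, tau_c, tau_t) for K_n with |S| = k and uniform weight omega."""
    rho = omega * n * ((n - 1) * omega) ** (-r)   # the only nonzero eigenvalue of the Laplacian
    R = k / n                                     # volume ratio R_S (all degrees are equal)
    tau_rho = math.log(1 + 0.5 / math.sqrt(n * R)) / rho
    tau_c = math.log(max(R, 1 - R) / abs(0.5 - R)) / rho
    tau_t = math.log(math.sqrt(n * R * (1 - R)) / abs(0.5 - R)) / rho
    return rho, tau_rho, tau_c, tau_t

def mbo_step_complete(n, k, tau, rho):
    """One (MBO_tau) step from chi_S via the explicit heat solution.

    Returns the thresholded value on S and on S^c (all nodes in each set agree by symmetry).
    """
    R = k / n
    decay = math.exp(-rho * tau)
    value_in_S = R + decay * (1 - R)
    value_outside_S = R - decay * R
    return (1 if value_in_S >= 0.5 else 0, 1 if value_outside_S >= 0.5 else 0)

n, k = 10, 3
rho, tau_rho, tau_c, tau_t = complete_graph_times(n, k)
below = mbo_step_complete(n, k, 0.9 * tau_c, rho)   # expected: pinned, i.e. (1, 0)
above = mbo_step_complete(n, k, 1.1 * tau_c, rho)   # expected: trivial state 0, i.e. (0, 0)
```

For these values the sufficient bounds $\tau_\rho$ and $\tau_t$ bracket the exact critical step $\tau_c$, but neither is sharp.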

By  symmetry,  both  the  Allen-Cahn  equations  and  mean  curvature  flows  reduce  to  two- 
dimensional  systems,  with  one  variable  governing  the  value  of  the  nodes  in  S  and  the  other  the 
nodes  in  Sc.  Critical  parameters  e  and  9t  exist  for  which  below  the  phase  remains  the  same  for 
all  nodes  and  above  the  phase  simultaneously  changes. 
Figure 1: Some small graphs, discussed in the examples of Section 6. (a) The complete graph
/\4,  (b)  the  star  graph  SG&,  (c)  a  small  grid,  and  (d)  a  cycle  graph  C4;  see  Sections  6.1,  6.2, 
6.4,  and  6.5. 


6.2  Star  graph 

Consider  a  star  graph  SGn  as  in  Figure  lb  with  n  >  3  nodes.  Here  the  central  node  (say  node 
1)  is  connected  to  all  other  nodes  and  the  other  n  —  1  nodes  are  only  connected  to  the  central 
node.  Hence,  for  all  i  e  {2, . . . ,  n},  uiu  =  un  >  0,  and  all  the  other  Ujk  are  zero. 

We  consider  the  unnormalized  graph  Laplacian  L  =  D  —  A  (r  =  0  in  (8)).  Since  d\  = 
J2j= 2  wij  and  di  =  uju,  for  i  6  {2, . . . ,  n},  we  can  explicitly  compute  the  characteristic  polyno¬ 
mial  of  L: 

n  n  n 

p(\)  =  (  -  A  +  Y^  Wi k)  -  A)  -  n  -  A). 

k= 2  3=2  k= 2  j>  2,  j/fc 

If  all  non-zero  edge  weights  have  the  same  value  ui,  this  simplifies  considerably  to 

p(  A)  =  ((n  —  1  )u  —  A)  {oj  —  A)"”1  —  w2(n  —  l)(w  —  A)n_2. 

Hence,  in  this  case,  the  eigenvalues  are  Ai  =0,  A*  =  ui  for  i  G  {2, . . . ,  n  —  1},  and  An  =  nu.  A 
choice  of  corresponding  (normalized)  eigenvectors  {u*}”=1  is  given  by15 


v  =  n 


’-XV, 


v)  =  2-^ 


1 

-1 

0 


if  j  =  i, 

if  j  =  i  +  1,  for  i  e  {2, . . . ,  n  —  1} 
else, 


;n  =  1  ( n  -  1  if  3  = 

\/ n(n  -  1)  [-1  if  J  7^  !• 


We  now  let  S  =  {1}  and  note  that  x.S  has  the  explicit  expansion  in  terms  of  these  eigenvec¬ 
tors, 

_  1  1  1  _  1 

Xs  =  n  2v  +  (n  —  1)271  2V  . 

We now consider the (MBO$_\tau$) iterates of $\chi_S$. We compute

$$e^{-\Delta\tau}\chi_S = n^{-1}\chi_V + (n-1)^{1/2} n^{-1/2} (e^{-\omega\tau})^n v^n.$$

Thus pinning occurs if $\tau < \tau_c := \frac{1}{n\omega}\log\left(2\frac{n-1}{n-2}\right)$. If $\tau > \tau_c$, the solution to the (MBO$_\tau$) evolution
gives the stationary solution, 0, after exactly one iteration. The bound from Theorem 4.2 states
that pinning occurs if $\tau < \frac{1}{n\omega}\log\frac{3}{2}$. The bound in Theorem 4.3 states that trivial dynamics
occur if $\tau > \frac{1}{\omega}\log\left(2\frac{n-1}{n-2}\right)$. Qualitatively, this example shows that it is easier for a solution to
be pinned on nodes with smaller degree.

$^{15}$ Here, subscripts $j$ denote the components of the vectors.
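The critical time step can be checked numerically. For $S = \{1\}$ the diffused value at the center is $\frac{1}{n} + \frac{n-1}{n}e^{-n\omega\tau}$, which equals $\frac12$ exactly at $\tau_c = \frac{1}{n\omega}\log\bigl(2\frac{n-1}{n-2}\bigr)$. The sketch below is a minimal illustration under stated assumptions: the heat flow is approximated by a truncated Taylor series on short substeps (a choice made here for self-containedness, not the paper's method), thresholding uses $u \geq \frac12$, and all function names and the values $n = 5$, $\omega = 1$ are hypothetical.

```python
import math

def star_apply_L(omega):
    """Return x -> L x for the unnormalized star-graph Laplacian (center = index 0)."""
    def apply_L(x):
        center = omega * sum(x[0] - xi for xi in x[1:])
        return [center] + [omega * (xi - x[0]) for xi in x[1:]]
    return apply_L

def heat(apply_L, x, tau, substeps=64, terms=25):
    """Approximate exp(-tau L) x with a truncated Taylor series on each substep."""
    h = tau / substeps
    for _ in range(substeps):
        term, total = list(x), list(x)
        for k in range(1, terms):
            term = [-h * v / k for v in apply_L(term)]   # (-hL)^k x / k!
            total = [a + b for a, b in zip(total, term)]
        x = total
    return x

def mbo_step(apply_L, chi, tau):
    """One MBO iteration: diffuse for time tau, then threshold at 1/2."""
    return [1.0 if u >= 0.5 else 0.0 for u in heat(apply_L, chi, tau)]

def tau_c(n, omega):
    """Solves 1/n + (n-1)/n * exp(-n*omega*tau) = 1/2 for tau."""
    return math.log(2.0 * (n - 1) / (n - 2)) / (n * omega)

n, omega = 5, 1.0
apply_L = star_apply_L(omega)
chi = [1.0] + [0.0] * (n - 1)                     # S = {center}
pinned = mbo_step(apply_L, chi, 0.9 * tau_c(n, omega))
vanished = mbo_step(apply_L, chi, 1.1 * tau_c(n, omega))
```

Just below $\tau_c$ the iterate equals the initial condition (pinning); just above it the iterate is identically zero after one step.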

We now consider an implication of Theorem 4.8 in the case where the graph induced by $S_1$
is a star graph with node 1 as center, i.e., $d'_i = 0$ for all $i \in S_1$. This is certainly true for the
case when the graph is a star graph and $S = \{2, \dots, n\}$. The following lemma states, in the
case where $r = 1$, a simple criterion on the degrees for which there exists a $\tau$ interval in which
node 1 switches phase in a single iteration of (MBO$_\tau$). We will see an application of the lemma
in Section 6.3.


Lemma 6.1. Let $r = 1$ and consider the case where the subgraph induced by $S_1$ is a star graph
with node 1 as center, i.e., $d'_i = 0$ for all $i \in S_1$. If either

• $|S_1| = 1$ and $d_i < d_1$, for $i \in S_1$, or

• $|S_1| \geq 2$ and $d_i < d_1$, for all $i \in S_1$,

then there exists a nontrivial $\tau$ interval (as in Theorem 4.8) such that node 1 will change phase
in the next (MBO$_\tau$) iterate.


Proof.  We  see  from  condition  (39)  (or  via  direct  computation  from  (38))  that  a  sufficient  con¬ 
dition  for  n  <  72,  is  to  have,  for  all  i  6  Si,  — ^  or  equivalently  — <  — 1 -.  If  i  £  S\  is  the 


di  d\ 


di 


only  node  in  Si,  then  con  =  d\  and  we  find  the  condition  di  <  di.  If  however  |<5i|  >  2,  we  have 


f  ^ — *\  j 

— j-  <  1,  because  dx  =  uin  +  2_^  uiji  and  thus  the  condition  on  -f-  can  be  replaced  by  the 


di 


simpler  (but  stronger)  condition  di  <  di,  for  all  i  6  Si. 


□ 


6.3  A  regular  tree 

We consider the (MBO$_\tau$) iterations on a regular tree as in Figure 2. Let $\omega_{ij} = \omega$, for all
$(i,j) \in E$, and $r = 1$. As in Figure 2a, we consider the case where the initial set $S$ consists of
the leaves of a branch. We first observe that the subgraph induced by $S_j$, for any $j \in V$, is a
star graph with node $j$ as center, i.e., $d'_i = 0$ for all $i \in S_j$ (for an example of a star graph with
five nodes, see Figure 1b), so that the hypotheses of Lemma 6.1 are satisfied with nodes 9 and
10 each playing the role of “node 1” in the lemma.

Applying Lemma 6.1 to node 9 in Figure 2a where $S = \{1, 2, 3, 4\}$, we see that there exists
a $\tau$ such that node 9 will change in the next iteration. By symmetry, node 10 will change in the
same iteration. If node 13 does not change in the first MBO iteration, Lemma 6.1 can be applied
again (because node 13 has two children, the hypotheses of the lemma are again satisfied with
$9, 10 \in S_{13}$) to show that there exists a $\tau$ such that node 13 will be added to the set. After
node 13 has been added to the set $S$, as in Figure 2b, the MBO iterates are stationary. To
see that node 15 cannot be added to $S$, assume that it were. Then the value of the Lyapunov
functional, (33), must have decreased. But by symmetry, in the next MBO iteration, node 15
will be removed from $S$, again decreasing the value of the Lyapunov functional, a contradiction.
The final configuration in Figure 2b minimizes the normalized cut, as defined in Section 2.2.

This argument is easily generalized to trees where each node, excluding leaves, has the same
number of children $c \geq 2$.

(a) Initial configuration

Figure 2: The initial and final configurations for an evolution by the (MBO$_\tau$) scheme on a tree
graph; see Section 6.3.


6.4  A  small  square  grid 

Here we construct an explicit example where Theorem 4.8 can be applied to show that there
exists a time interval $(\tau_1, \tau_2)$ such that a node is guaranteed to change in one iteration of the
(MBO$_\tau$) algorithm.

Consider a three by three (nonperiodic) square grid as in Figure 1c with unit edge weights,
with nodes numbered 1 through 9 from left to right, top to bottom. Let $S = \{4, 6, 7, 8, 9\}$. We
focus on node 5. We have $\mathcal{N}_5 = \{2, 4, 6, 8\}$ and $S_5 = \mathcal{N}_5 \cap S = \{4, 6, 8\}$. We then compute
$d'_5 = 3$, $d_5 = 4$, $d'_4 = d'_6 = d'_8 = 0$ and $d_4 = d_6 = d_8 = 3$. It is easily checked that, for $i \in S_5$,


4 

c/5 


and 


i5  1  3  dg 

di  3  4  ds  ’ 


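such that conditions (39) are satisfied. Furthermore, with $\bar{S}_5 := S_5 \cup \{5\}$,

$$\sum_{k \in \bar{S}_5} \frac{d'_k}{d_k}\frac{\omega_{ik}}{d_i} = \begin{cases} 0 & \text{if } i = 5, \\ \frac{1}{4} & \text{if } i \in S_5, \end{cases}$$

thus $\left(\frac{d'_5}{d_5}\right)^2 - \max_{i \in \bar{S}_5} \sum_{k \in \bar{S}_5} \frac{d'_k}{d_k}\frac{\omega_{ik}}{d_i} = \left(\frac{3}{4}\right)^2 - \frac{1}{4} = \frac{5}{16} > 0$, and even the full condition (38), for
$r = 1$, is satisfied. From (36) we can then compute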


3  ±  V5, 


for the time interval $(\tau_1, \tau_2)$ of Theorem 4.8.
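The degree bookkeeping in this example is easy to reproduce. The following minimal Python sketch rebuilds the $3 \times 3$ grid and recomputes $S_5$, the degrees, and the quantity $\frac{5}{16}$ in exact rational arithmetic. It assumes that $d'_i$ denotes the degree of node $i$ restricted to neighbors in $S_5$, which is the reading consistent with the values quoted in the text; the helper names are hypothetical.

```python
from fractions import Fraction

def grid_neighbors(rows, cols):
    """Neighbor sets of a nonperiodic grid, nodes numbered 1..rows*cols row by row."""
    nbrs = {}
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c + 1
            adj = set()
            if r > 0:
                adj.add(node - cols)
            if r < rows - 1:
                adj.add(node + cols)
            if c > 0:
                adj.add(node - 1)
            if c < cols - 1:
                adj.add(node + 1)
            nbrs[node] = adj
    return nbrs

nbrs = grid_neighbors(3, 3)
S = {4, 6, 7, 8, 9}
center = 5
S5 = nbrs[center] & S                   # neighbors of node 5 that lie in S
S5_bar = S5 | {center}

degree = {i: len(nbrs[i]) for i in nbrs}                 # d_i (unit edge weights)
restricted = {i: len(nbrs[i] & S5) for i in S5_bar}      # d'_i (assumed meaning)

def weighted_sum(i):
    """Sum over k in S5_bar of (d'_k / d_k) * (omega_ik / d_i), omega_ik in {0, 1}."""
    return sum(Fraction(restricted[k], degree[k]) * Fraction(int(k in nbrs[i]), degree[i])
               for k in S5_bar)

gap = Fraction(restricted[center], degree[center]) ** 2 - max(weighted_sum(i) for i in S5_bar)
```

Using `Fraction` avoids rounding, so the positivity of `gap` is checked exactly rather than up to floating-point error.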


Figure 3: Two (MBO$_\tau$) evolutions on the 2-torus graph, $T^2_{32,12}$. The top and bottom ‘border’
nodes are connected (not shown) as are the left and right ‘border’ nodes. (a) Initial condition.
(b) For $\tau = 1.12$, the stationary state shown is reached in 4 iterations. (c) For $\tau = 4$, the
stationary state shown is reached in 5 iterations. See Section 6.5.


6.5  Torus  graph 

Consider the $n$-cycle, $C_n$, with $n$ nodes. The nodes are arranged in a circle and each node is
connected to its 2 neighbors. We take $\omega_{ij} = \omega$ for $i \sim j$ and zero otherwise. See Figure 1d.
We consider the unnormalized graph Laplacian $L = D - A$ ($r = 0$ in (8)). In this case, $L$ is the
circulant matrix $\omega\,\mathrm{diag}(\{-1, 2, -1\}, \{-1, 0, 1\})$. The eigenpairs $\{(\lambda_j, v^j)\}_{j=1}^n$ are given by

$$\lambda_j = 2\omega - 2\omega\cos\frac{2\pi(j-1)}{n}, \qquad v^j_k = \exp\bigl(2\pi i (j-1) k / n\bigr).$$

We then consider the 2-torus graph, $T^2_{n_1,n_2}$, which is the Kronecker (tensor) product of the
$n_1$- and $n_2$-cycles. See Figure 3. In particular, if $u$ and $v$ are eigenfunctions of the graph
Laplacian on $C_{n_1}$ and $C_{n_2}$ with corresponding eigenvalues $\alpha$ and $\beta$ respectively, then $w = u \otimes v$
(with $w_{ij} = u_i v_j$) is an eigenvector of $T^2_{n_1,n_2}$ with corresponding eigenvalue $\alpha + \beta$. In particular,
the spectral radius of the Laplacian is $\rho = 8\omega$.
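The eigenvalue statements for the cycle and the torus can be verified directly. The sketch below is a minimal check under stated assumptions: the torus is taken to be the periodic grid in which every node has four neighbors (as in Figure 3), the index `m` plays the role of $j$ in the text, and all function names and test sizes are hypothetical. Note that the value $8\omega$ for the spectral radius is attained when both cycle lengths are even, as for the $32 \times 12$ torus used below.

```python
import cmath
import math

def cycle_eigenvalue(m, n, omega):
    """lambda_m = 2*omega - 2*omega*cos(2*pi*(m-1)/n) for m = 1, ..., n."""
    return 2 * omega - 2 * omega * math.cos(2 * math.pi * (m - 1) / n)

def cycle_eigenvector(m, n):
    """Components exp(2*pi*i*(m-1)*k/n) for k = 0, ..., n-1."""
    return [cmath.exp(complex(0, 2 * math.pi * (m - 1) * k / n)) for k in range(n)]

def torus_apply_L(w, omega):
    """Unnormalized Laplacian of the n1 x n2 periodic grid applied to w[a][b]."""
    n1, n2 = len(w), len(w[0])
    return [[omega * (4 * w[a][b] - w[a - 1][b] - w[(a + 1) % n1][b]
                      - w[a][b - 1] - w[a][(b + 1) % n2])
             for b in range(n2)] for a in range(n1)]

def torus_residual(m1, m2, n1, n2, omega):
    """Max entry of |L w - (alpha + beta) w| for w = u (x) v."""
    u, v = cycle_eigenvector(m1, n1), cycle_eigenvector(m2, n2)
    w = [[ua * vb for vb in v] for ua in u]
    lam = cycle_eigenvalue(m1, n1, omega) + cycle_eigenvalue(m2, n2, omega)
    Lw = torus_apply_L(w, omega)
    return max(abs(Lw[a][b] - lam * w[a][b]) for a in range(n1) for b in range(n2))

def torus_spectral_radius(n1, n2, omega):
    """Largest eigenvalue: sum of the largest eigenvalues of the two cycles."""
    return (max(cycle_eigenvalue(m, n1, omega) for m in range(1, n1 + 1))
            + max(cycle_eigenvalue(m, n2, omega) for m in range(1, n2 + 1)))
```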

Consider for a moment $T^2_{n_1,n_2}$ as a discretization of the torus, $\mathbb{T}^2$. The nontrivial minimal-perimeter
subsets of $\mathbb{T}^2$ are given by “strips”. Thus we might expect that for some initial
condition, $\chi_S$, $S \subset V$, the evolution by MBO, Allen-Cahn, or MC would converge to a strip.

We consider the (MBO$_\tau$) evolution on a $32 \times 12$ torus with $\omega = 1$ and initial condition
as in Figure 3a. For $\tau = 1.12$, the solution is stationary after 4 iterations once the “high
curvature corners” have been removed, as in Figure 3b. For $\tau = 4$, the solution evolves into a
minimal-perimeter “strip” in 5 iterations, as in Figure 3c.

For  the  parameters  in  Figure  3b,  we  compute  the  guaranteed  stationarity  bounds  in  (30) 
and  (31)  to  be  tp  «  0.0057  and  tk  =  respectively,  showing  these  bounds  are  not  sharp,  figs 

Consider (MCF$_{\delta t}$), with $S_n$ equal to the minimal-perimeter strip in Figure 3c. Then $S_{n+1} = S_n$
is a minimizer of $\mathcal{F}(\cdot, S_n)$, but so are $S_{n+1} = S_n \cup \partial(S_n^c)$ and $S_{n+1} = S_n \setminus \partial S_n$ (or variations in
which only one ‘vertical line’ of the boundary is added or removed). This illustrates a possible
type of non-uniqueness for (MCF$_{\delta t}$), which occurs when $S_n$ is totally geodesic (i.e., its boundary
is a geodesic). To reiterate, the stationary solution in Figure 3b is frozen (due to the smallness
of $\tau$), while the solution in Figure 3c is totally geodesic.

6.6  Buckyball  graph 

Consider the buckyball graph with 60 nodes and 90 edges with $\omega_{ij} = \omega$ for all edges $(i,j)$ as in
Figure 4. The graph is regular; each node has degree $3\omega$.

Figure 4: An (MBO$_\tau$) evolution with $\tau = 2$ on the buckyball graph. The solution at each
iteration is the characteristic function of the gray nodeset. See Section 6.6.


Consider for a moment the buckyball graph as a (coarse) discretization of the sphere, $\mathbb{S}^2$.
There are no nontrivial minimal-perimeter subsets of $\mathbb{S}^2$. Great circles are the only nontrivial
stationary submanifolds of $\mathbb{S}^2$ (and have constant curvature). In fact, great circles are totally
geodesic. Thus we might expect that for any initial condition, $\chi_S$, $S \subset V$ such that $|S| \neq |\mathbb{S}^2|/2$,
the evolution by MBO, Allen-Cahn, or MC would converge to a stationary solution, either 0 or
$\chi_V$ depending on the initial mass, $M(\chi_S) = \mathrm{vol}\, S$. If $S$ is chosen to be a symmetric partitioning
of the nodes for the buckyball graph, we expect that the (MBO$_\tau$) evolution will be stationary
for all values of $\tau$.

The bound from Theorem 4.2 states that pinning in (MBO$_\tau$) occurs if

$$\tau < \lambda_n^{-1}\log\Bigl(1 + \tfrac{1}{2}|S|^{-1/2}\Bigr).$$

The bound in Theorem 4.3 states that trivial dynamics occur if


r  >  A2  1  log 


/(3u>)S|S|(n-|.S|)\ 

1  IISI-ll  )' 


We find numerically that $\lambda_2 \approx \omega^{1-r} 3^{-r} \cdot 0.2434$ and $\lambda_n \approx \omega^{1-r} 3^{-r} \cdot 5.6180$.

For initial condition $\chi_S$, with $|S| = 14$, as given in Figure 4 (top left), and $r = 0$ and $\omega = 1$,
Theorems 4.2 and 4.3 predict that pinning occurs if $\tau < 0.0223$ and trivial dynamics occur if
$\tau > 15.1811$. We find numerically that this initial condition is pinned if $\tau < 1.89$ and trivial
dynamics occur if $\tau > 3.54$. For intermediate values of $\tau$, the iterates shrink to the empty node
set. For $\tau = 2$, the iterates take 3 iterations to reach steady state, as illustrated in Figure 4.

For the initial condition $\chi_S$ where $S$ is taken to be a symmetric partitioning of the nodes,
the (MBO$_\tau$) evolution is pinned for all values of $\tau$.


6.7  Adjoining  regular  lattices 

We consider the graph which is composed by adjoining a square and triangular lattice. See
Figure 5. We take $r = 0$ and $\omega_{ij} = 1$ for $i \sim j$ and zero otherwise. Note that the degree of a
node in the triangular lattice is 6 and the degree of a node in the square lattice is 4.

To test the intuition from the star graph (see Section 6.2) that it is easier for the solution
to pin on nodes with smaller degree, we consider the initial condition given in the top left panel
of Figure 5. The mass is initially distributed over both the square and triangular lattice sites.
We consider the (MBO$_\tau$) evolution with $\tau = 0.8$. The solution moves freely on the lattice sites
with degree $> 4$, i.e., on the triangular lattice. However, on the square lattice, the solution only
‘rounds corners’.

The nodes on the ‘border’ of the graph (where the regular lattice was cut) have smaller
degree. In Figure 6, we demonstrate that the solution can also be pinned on the border. Again,
the initial condition is given in the top left panel. In this simulation, we take $\tau = 0.9$. Away
from the boundary, the solution set can again shrink freely. However, the solution becomes
pinned on the border.


6.8  Two  moons  graph 

In this last example, we consider a graph which is widely used as a benchmark problem for
partitioning algorithms. Our construction of the graph follows [BH09]. The graph is generated
by first randomly distributing 600 points in a region described by two half arcs in $\mathbb{R}^2$, referred
to as “two moons”. See Figure 7 (top left). The points are then embedded in $\mathbb{R}^{100}$ and randomly
perturbed by i.i.d. Gaussian noise with mean zero and standard deviation $\sigma = 0.1$. Let $k = 10$.
The edge weights are chosen to be

$$\omega_{ij} = \max\{s_i(j), s_j(i)\}, \quad \text{where } s_i(j) = e^{-\frac{4}{d_i^2}\|x_i - x_j\|^2},$$

and $d_i$ is the Euclidean distance between $x_i$ and its $k$-th nearest neighbor. We then take the
symmetrized $k$-nearest neighbors graph. This is given in Figure 7 (top right).
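The weight construction can be sketched in a few lines of Python. This is a minimal illustration, not the code used for Figure 7: it assumes that "symmetrized" means that $i$ and $j$ are joined when either is among the $k$ nearest neighbors of the other, it uses a tiny hand-picked point set instead of the 600 noisy points, and the function name `knn_self_tuning_weights` is hypothetical.

```python
import math

def knn_self_tuning_weights(points, k):
    """Weights omega_ij = max(s_i(j), s_j(i)) on a symmetrized k-NN graph.

    s_i(j) = exp(-4 * |x_i - x_j|^2 / d_i^2), where d_i is the distance from
    x_i to its k-th nearest neighbor. Returns (W, d).
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Neighbors of i sorted by distance, excluding i itself.
    order = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
             for i in range(n)]
    d = [dist[i][order[i][k - 1]] for i in range(n)]
    knn = [set(order[i][:k]) for i in range(n)]

    def s(i, j):
        return math.exp(-4.0 * dist[i][j] ** 2 / d[i] ** 2)

    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Assumed symmetrization: keep the edge if either endpoint selects it.
            if i != j and (j in knn[i] or i in knn[j]):
                W[i][j] = max(s(i, j), s(j, i))
    return W, d

# Toy usage on four collinear points with k = 2.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
W, d = knn_self_tuning_weights(points, k=2)
```

With this scaling, the weight between a point and its $k$-th nearest neighbor is at least $e^{-4}$, so the local scale $d_i$ adapts the kernel width to the sampling density around $x_i$.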

We consider the (MBO$_\tau$) evolution with $\tau = 5$ and initial condition as shown in Figure 7
(bottom left). After 9 iterations, the (MBO$_\tau$) evolution converges to the state in Figure 7
(bottom right).

We want to stress that the two moons example is meant as an illustration of the (MBO$_\tau$)
algorithm on a more complex toy graph. In this paper we do not aim to compete in terms of
accuracy or efficiency with existing clustering methods, hence we will not focus on those aspects
of the two moons example.

Figure 5: An (MBO$_\tau$) evolution with $\tau = 0.8$ on a graph consisting of adjoining regular lattices.
The solution at chosen iterations is the characteristic function of the gray nodeset. For the
initial condition, given by the top left panel, the evolution reaches a steady state in 9 iterations.
Iterations 3 (top right), 6 (bottom left), and 9 (bottom right) are shown. This example
strengthens the ‘rule of thumb’ that it is easier for a solution to become pinned on nodes with
smaller degree. See Section 6.7.


7  Discussion  and  open  questions 

Motivated by curvature flows in continuum mechanics, we described several analogous processes
on graphs. In particular we used the graph total variation, or graph cut, to define curvature on
graphs, which we then related to the graph Allen-Cahn equation, graph MBO scheme, and graph
mean curvature flow. The continuum intuition for these problems suggests many results, some
of which we proved in this paper, some of which we have shown cannot hold on a graph because of
the lack of infinitesimal length scales, and some of which we state below as still unproven open
questions.

Figure 6: An (MBO$_\tau$) evolution with $\tau = 0.9$ on a graph consisting of adjoining regular lattices.
The solution at chosen iterations is the characteristic function of the gray nodeset. For the initial
condition, given by the top left panel, the evolution reaches a steady state in 13 iterations.
Iterations 4 (top right), 9 (bottom left), and 13 (bottom right) are shown. This example
strengthens the ‘rule of thumb’ that it is easier for a solution to become pinned on nodes with
smaller degree. See Section 6.7.

In a sense to be made precise, for a suitable choice of $\tau$ (not too small, not too large,
depending on $\omega$, most likely depending on the graph’s spectrum), the dynamics of (MBO$_\tau$) are
expected to approximate those of graph MCF.

Question 7.1 (MBO and graph Mean Curvature Flow). Is there an interval of $\tau$ (depending
on $\delta t$), for which a single (MBO$_\tau$) iteration minimizes the (MCF$_{\delta t}$) functional $\mathcal{F}$ from (25)?
For such a $\tau$, the graph mean curvature flow (MCF$_{\delta t}$) would coincide with the (MBO$_\tau$) scheme
(up to a time rescaling).

Figure 7: (top) Construction of the two moons graph. (top left) Some random points in the
shape of two moons. (top right) The connectivity of the graph resulting from connecting
nearest neighbors after adding high dimensional noise. (bottom) An (MBO$_\tau$) evolution for
$\tau = 5$, starting with initial condition on the left and terminating at the stationary solution on
the right in 9 iterations. See Section 6.8.


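An approach to Question 7.1 uses the Taylor series expansion for the solution of the graph
heat equation: $e^{-\tau\Delta}\chi_S = \sum_{k=0}^{\infty} \frac{1}{k!}(-\tau\Delta)^k \chi_S$, for $S \subset V$. Hence, we can rewrite the Lyapunov
functional $J$ from (33) as

$$J(\chi_S) = \langle 1 - \chi_S, \chi_S - \tau\Delta\chi_S \rangle_V + R_S(\tau), \quad \text{where } R_S(\tau) := \sum_{k=2}^{\infty} \frac{(-\tau)^k}{k!} \langle 1 - \chi_S, \Delta^k \chi_S \rangle_V.$$

Using $\langle 1 - \chi_S, \chi_S \rangle_V = 0$, $\langle 1, \Delta\chi_S \rangle_V = 0$, and (3), we find

$$J(\chi_S) = \tau\,\mathrm{TV}^1_{\mathrm{a}}(\chi_S) + R_S(\tau).$$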
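The leading-order relation between $J$ and the graph cut can be observed numerically. The sketch below is a minimal illustration under stated assumptions: it takes $r = 0$ (so the inner product is the standard one), reads the Lyapunov functional as $J(\chi_S) = \langle 1 - \chi_S, e^{-\tau\Delta}\chi_S\rangle_V$, which is the form implied by the expansion above, and identifies the total variation of $\chi_S$ with the sum of the weights of the edges cut by $S$. The small weighted graph and all function names are hypothetical.

```python
def laplacian(weights, n):
    """Dense unnormalized Laplacian from a dict {(i, j): weight} of undirected edges."""
    L = [[0.0] * n for _ in range(n)]
    for (i, j), w in weights.items():
        L[i][j] -= w
        L[j][i] -= w
        L[i][i] += w
        L[j][j] += w
    return L

def matvec(M, x):
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def heat_taylor(L, x, tau, terms=30):
    """exp(-tau L) x via its Taylor series; adequate here because tau * |L| is tiny."""
    term, total = list(x), list(x)
    for k in range(1, terms):
        term = [-tau * v / k for v in matvec(L, term)]
        total = [a + b for a, b in zip(total, term)]
    return total

def lyapunov(L, chi, tau):
    """J(chi) = <1 - chi, exp(-tau L) chi> with the standard inner product (r = 0)."""
    u = heat_taylor(L, chi, tau)
    return sum((1.0 - c) * ui for c, ui in zip(chi, u))

def cut(weights, chi):
    """Total weight of edges joining S = {chi = 1} to its complement."""
    return sum(w for (i, j), w in weights.items() if chi[i] != chi[j])

weights = {(0, 1): 1.0, (1, 2): 2.0, (2, 3): 1.0, (3, 4): 3.0, (0, 4): 1.0, (1, 3): 0.5}
chi = [1.0, 1.0, 0.0, 0.0, 0.0]          # S = {0, 1}
L = laplacian(weights, 5)
tau = 1e-4
ratio = lyapunov(L, chi, tau) / tau      # close to cut(weights, chi) for small tau
```

For this graph the second-order term is $-\frac{\tau^2}{2}\|L\chi_S\|^2$, so the ratio sits slightly below the cut value and approaches it linearly as $\tau \to 0$.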

This connection between the Lyapunov functional $J$ and the total variation, and hence the
MCF functional $\mathcal{F}$ from (25), strengthens the plausibility of a positive answer to Question 7.1.
A more difficult question, which could be of great use in numerical problems, is how we can
estimate the number of iterations of (MBO$_\tau$) needed to go from some initial data to a minimizer
of the Ginzburg-Landau functional or graph cut functional.

Question 7.2 (Minimizing graph cut). For any, a priori specified, approximation error, is
there a local, quantitative bound on the number of iterations of (MBO$_\tau$) needed to approximate
a minimizer of the graph cut functional $\mathrm{TV}^q_{\mathrm{a}}$ up to the specified accuracy? “Local” means here
that the bound does not rely on the spectrum of the graph, but instead uses quantities such as
graph curvature $\kappa^{q,r}$ or the total variation $\mathrm{TV}^q_{\mathrm{a}}(\chi_N)$ for some local graph neighborhood $N \subset V$.
The analogous question can be asked for (MCF$_{\delta t}$).

In Theorem 4.8 it was shown that, if the curvature at a given node is sufficiently large and
the time step $\tau$ in (MBO$_\tau$) is chosen in the right interval, then the value at the node will change
in one (MBO$_\tau$) iteration. The next question is the analogous statement for the Allen-Cahn
equation, (ACE$_\varepsilon$).

Question 7.3 (Non-freezing for Allen-Cahn). Let $\varepsilon$ be in some positive interval and let $u^\varepsilon$ be
a solution of the Allen-Cahn equation (ACE$_\varepsilon$) for this choice of $\varepsilon$. Suppose that the curvature
$(\kappa^{1,r}_S)_i$ of $S = \{j \mid u^\varepsilon_j(t_0) < 0\}$ at a node $i \in S$ is sufficiently large (possibly depending on $\varepsilon$).
Is there some interval of positive times such that $u^\varepsilon_i(t_0 + h) > 0$ for $h$ in this interval?

Because (ACE$_\varepsilon$) is derived from the graph functional (GL$_\varepsilon$), we suspect that the correct
curvature in Question 7.3 is $\kappa^{1,r}_S$, the curvature related to the anisotropic functional $\mathrm{TV}^1_{\mathrm{a}}$,
which was identified as the $\Gamma$-limit of (GL$_\varepsilon$) for $\varepsilon \to 0$ in earlier work [vGB12], and not the
curvature which can be derived from the isotropic total variation functional TV as the continuum
case might suggest at first glance. Since we have seen that pinning occurs for small enough $\varepsilon$,
full convergence is not expected here, but the numerical examples of Section 6 suggest an
approximate result for small $\varepsilon$ is feasible.

Question 7.4 (Allen-Cahn and graph Mean Curvature Flow). Is there an $\varepsilon > 0$ such that,
given the solution $u^\varepsilon$ to (ACE$_\varepsilon$) for some $\varepsilon > 0$, there is an increasing sequence of times $t_n$ for
which either the sets $S_n := \{j \mid u^\varepsilon_j(t_n) < 0\}$ or the sets $S_n := \{j \mid u^\varepsilon_j(t_n) > 0\}$ form a solution
to the graph MCF?

Furthermore, among sequences with this property is there exactly one sequence $\{t_n\}$ that
is maximal in the following sense: there exists no sequence $\{t'_n\}$, of which $\{t_n\}$ is a strict
subsequence, such that $\{S_{t'_n}\}_n \neq \{S_{t_n}\}_n$ and is still a solution to the graph MCF?

A  different  question  is  how  the  graph  MCF  behaves  in  the  continuum  limit,  when  it  is 
formulated  on  a  sequence  of  graphs  which  are  ever  finer  discretizations  of  some  continuum  space. 
We  expect  that  it  should  give  back  the  usual  MCF  in  the  continuum  limit,  or  some  anisotropic 
MCF,  as  the  convergence  results  in  [vGB12]  show  the  final  limit  could  crucially  depend  on  the 
scaling in $\delta t$ and the discretization parameter (which will show up in the graph weights). This
question  is  similar  (and  perhaps  equivalent)  to  the  convergence  of  discretization  schemes  for  the 
usual  MCF.  Similar  questions  can  be  asked  about  the  graph  Allen-Cahn  equation  and  graph 
MBO  scheme. 

Question 7.5 (Stability of graph MCF, MBO, and ACE, in the continuum limit). Suppose we
are given any sequence of graphs $(V_k, \omega_k)$, $k \in \mathbb{N}$, converging in the Gromov-Hausdorff sense to
a Riemannian manifold $(M, g)$. Is there a fixed time interval such that, as $k \to \infty$, any sequence
generated by (MBO$_\tau$) with $\tau$ in this interval converges to a sequence generated by the (possibly
anisotropic) continuum MBO algorithm in $M$ (with the Laplacian induced by $g$)? Accordingly,
do solutions of (ACE$_\varepsilon$) converge to solutions of the (possibly anisotropic) continuum Allen-Cahn
equation in $M$, and do solutions to (MCF$_{\delta t}$) converge to viscosity solutions (via the level set
formulation) of (possibly anisotropic) continuum MCF in $M$, with initial data given by the limit
of the initial data in each $V_k$?

As explained in Appendix A, MCF is closely related to certain models of continuum phase
transitions, particularly Allen-Cahn and Ginzburg-Landau dynamics. However, another important
connection with statistical mechanics involves Ising models and other interacting particle
systems, which are known to converge in the mesoscopic limit to flow by mean curvature. In
work of Katsoulakis and Souganidis [KS94, KS97] convergence to a viscosity solution of MCF
is first proved. See also the related work of Funaki and Spohn [FS97] where MCF is derived
as a deterministic limit of stochastic Ginzburg-Landau dynamics. On the other hand, there is
a vast literature concerned with (for instance) the Ising model (and its generalizations) on graphs
[Lyo89, Lyo00], see also Durrett’s book [Dur07]. This suggests the following question.

Question 7.6 (Possible probabilistic interpretations of graph MCF, MBO, and AC). Is (MCF$_{\delta t}$)
related to an interacting particle system on the underlying graph? Also, are there interacting
particle systems or stochastic processes in $V$ that are related to (MBO$_\tau$) or (ACE$_\varepsilon$)?

Finding such a system would partly resolve the issue that a front moving on a graph in
continuous time necessarily does so in a way that, from a continuum point of view, looks
discontinuous (as discussed previously in this paper, in particular in Sections 3.3 and 5), as
the particle dynamics would be continuous in time and stochastic. The convergence results in
[KS94, KS97, FS97] show that the above question has an a priori higher chance of having a
positive answer for a large graph, as it already holds in the continuum limit.

A  The continuum case

In this appendix, we briefly review and provide references for the Allen-Cahn equation, the
MBO algorithm, and mean curvature flow in the continuum setting.

A.1  The continuum Allen-Cahn equation

The Allen-Cahn equation is a reaction-diffusion equation, given by

$$u_t = \Delta u + f(u), \qquad (43)$$

where $u : \mathbb{R}^n \times \mathbb{R}_+ \to \mathbb{R}$ and $\Delta$ is the standard Laplacian (although other linear elliptic operators
can be considered as well), and $f$ is a non-linear function of the form $f = -W'$ where $W : \mathbb{R} \to \mathbb{R}$
is a double well potential with two global minima. For simplicity, take $W(u) = (u+1)^2(u-1)^2$,
where the minima are at $\pm 1$.
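For this choice of $W$ the nonlinearity is $f(u) = -W'(u) = -4u(u^2 - 1)$, with zeros at $u = -1, 0, 1$. The following minimal Python sketch checks this derivative numerically and integrates the spatially homogeneous case $u_t = f(u)$ (no diffusion) by forward Euler, to illustrate how the two wells attract solutions. The step size, the number of steps, and the function names are illustrative assumptions; this is not a solver for the full equation (43).

```python
def W(u):
    """Double well potential with minima at u = -1 and u = 1."""
    return (u + 1.0) ** 2 * (u - 1.0) ** 2

def f(u):
    """Reaction term f = -W', i.e. f(u) = -4 u (u^2 - 1)."""
    return -4.0 * u * (u * u - 1.0)

def relax(u0, dt=0.01, steps=2000):
    """Forward Euler for the spatially homogeneous equation u' = f(u)."""
    u = u0
    for _ in range(steps):
        u += dt * f(u)
    return u

def derivative_mismatch(u, h=1e-5):
    """|central difference of W at u + f(u)|; should be near zero."""
    return abs((W(u + h) - W(u - h)) / (2.0 * h) + f(u))
```

Positive initial values relax to $1$ and negative ones to $-1$, while $u = 0$ is an unstable equilibrium that the scheme leaves unchanged.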

A question which is always of interest is understanding the way that solutions to (43)
converge to equilibrium. For each fixed $x$, one expects that $u(x,t)$ approaches either 1 or $-1$,
as $t \to +\infty$, as these values correspond to the minima of $W$. This indicates that for very large
$t$ the function $u$ defines two regions of $\mathbb{R}^n$, where it is very close to either 1 or to $-1$, separated
by a smooth transition layer in between.

This asymptotic behavior is well understood nowadays. Rescaling $(x, t)$ as $\left(\frac{x}{\varepsilon}, \frac{t}{\varepsilon^2}\right)$, we obtain
the equation

$$u^\varepsilon_t = \Delta u^\varepsilon + \varepsilon^{-2} f(u^\varepsilon). \qquad (44)$$
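The rescaled equation follows from the chain rule. The short computation below assumes the reading $u^\varepsilon(x,t) := u(x/\varepsilon, t/\varepsilon^2)$ of the rescaling, with $u$ a solution of (43).

```latex
\begin{align*}
\partial_t u^\varepsilon(x,t) &= \varepsilon^{-2}\,(\partial_t u)\bigl(x/\varepsilon,\, t/\varepsilon^2\bigr), \\
\Delta u^\varepsilon(x,t) &= \varepsilon^{-2}\,(\Delta u)\bigl(x/\varepsilon,\, t/\varepsilon^2\bigr), \\
\partial_t u^\varepsilon - \Delta u^\varepsilon
  &= \varepsilon^{-2}\,(u_t - \Delta u)\bigl(x/\varepsilon,\, t/\varepsilon^2\bigr)
   = \varepsilon^{-2} f(u^\varepsilon).
\end{align*}
```

Both derivatives pick up the same factor $\varepsilon^{-2}$ under this parabolic scaling, so only the reaction term is amplified.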


Note that for very small $\varepsilon$ the function $u^\varepsilon$ describes the long time behavior of the original
$u$. Then, it is well known (see [BSS93, BK91] for background and discussion) that, as $\varepsilon \to 0^+$,
the solutions $u^\varepsilon(x,t)$ converge to a function which takes the value $-1$ in some set $S_t$ (depending
on time) and takes the value 1 in $S_t^c$. Here $S_t$ is a set whose boundary is evolving by mean
curvature flow (see Section A.2).

Although the original motivation for studying (43) was phase transitions, it is also the
gradient flow of the Ginzburg-Landau functional. Precisely, equation (44) is the $L^2$ gradient
flow of the functional

$$GL_\varepsilon(u) := \int \frac{1}{2}|\nabla u|^2 + \frac{1}{\varepsilon^2} W(u)\, dx. \qquad (45)$$
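The gradient flow structure can be made explicit with a first-variation computation. The sketch below assumes that the functional has the form $\int \frac12|\nabla u|^2 + \varepsilon^{-2}W(u)\,dx$, which is the normalization consistent with (44), and that the test function $v$ decays fast enough for the boundary terms in the integration by parts to vanish.

```latex
\begin{align*}
\frac{d}{ds}\, GL_\varepsilon(u + s v)\Big|_{s=0}
  &= \int \nabla u \cdot \nabla v + \varepsilon^{-2} W'(u)\, v \; dx \\
  &= \int \bigl( -\Delta u + \varepsilon^{-2} W'(u) \bigr)\, v \; dx ,
\end{align*}
so the $L^2$ gradient is $-\Delta u + \varepsilon^{-2} W'(u)$ and the gradient flow reads
\[
u_t = \Delta u - \varepsilon^{-2} W'(u) = \Delta u + \varepsilon^{-2} f(u), \qquad f = -W'.
\]
```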


It is expected that solutions to (44) converge to a local minimum of this functional, as $t \to +\infty$,
thus schemes for (44) could be used for approximating minima of (45). This is the application
that serves as the biggest motivation in the graph setting.

For more information about reaction-diffusion equations with a polynomial nonlinearity we
refer to [Tem97, Section 1.1].


A.2  Continuum mean curvature flow

Mean curvature flow (MCF) consists of the evolution of a closed, oriented hypersurface $\Sigma_t \subset \mathbb{R}^n$
over time, such that the inner normal velocity at a given point of $\Sigma_t$ is equal to the mean
curvature of $\Sigma_t$ at that point. The study of such a flow has been greatly motivated by phase
transition models in crystal growth and materials science, in particular since the important work
of Allen and Cahn [AC79]. Starting with the seminal work of Brakke [Bra78], the mathematical
study of this flow has been vast, and has involved areas of mathematics ranging from differential
geometry to stochastic control. The use of MCF is now widespread in the modeling of moving
fronts [CS94]. The reason why MCF is so ubiquitous in the phase transitions literature is
that many singular limits of reaction-diffusion equations (i.e., singular limits of Ginzburg-Landau
dynamics) converge to motion by mean curvature. See [BK91, Peg89, BG95] for precise
convergence theorems and further discussion.

A  well  known  feature  of  MCF  is  both  the  formation  of  singularities  and  the  occurrence  of 
topological  changes,  regardless  of  the  smoothness  of  the  initial  data.  A  significant  portion  of  the 
literature  on  MCF  deals  with  notions  of  weak  solutions,  the  first  of  which  goes  back  to  Brakke 
[Bra78].  Partial  regularity  for  weak  solutions  as  well  as  regularity  up  to  the  first  singular  time 
have  been  widely  studied  [Eck04]. 

An equivalent formulation of the flow looks not only at the hypersurface $\Sigma_t$, but at the entire
domain $\Omega_t$ bounded by it, so that $\partial\Omega_t = \Sigma_t$. Accordingly, it is said that $\Omega_t$ itself is evolving by
mean curvature flow. This perspective is natural for phase transitions.

Let $\phi(t, \cdot) : \mathbb{R}^n \to \mathbb{R}$ be the signed distance function to the set $\Omega$ at time $t$. From the level
set method perspective [OF03], the motion by mean curvature$^{16}$ of $\Omega_t$ corresponds to an initial
value problem for a fully non-linear degenerate parabolic equation,

$$\phi_t = F(D^2\phi, \nabla\phi), \quad \phi(\cdot, 0) = \phi_0, \quad \text{where } F(D^2\phi, \nabla\phi) = |\nabla\phi|\,\mathrm{div}\left(\frac{\nabla\phi}{|\nabla\phi|}\right). \qquad (46)$$


$^{16}$ In the literature two related, but different, concepts of mean curvature appear. One corresponds with the
factor $\mathrm{div}\left(\frac{\nabla\phi}{|\nabla\phi|}\right)$ in (46), the other has a normalization factor $\frac{1}{d-1}$ where $d$ is the dimension of the space. This
normalization by the dimension of the hypersurface justifies the “mean” part of “mean curvature”.

Then, when there is a smooth solution $\phi(x,t)$, the domains given by $\Omega_t := \{\phi(\cdot, t) < 0\}$ will be
evolving by mean curvature flow and will start from the original domain $\Omega_0$. In general, even for
an initial domain with a smooth boundary, a smooth solution might not exist for all times, and
one must work with viscosity solutions. In this context, the convergence of the MBO scheme
(47) (explained in Section A.3) to such viscosity solutions was proved by Evans [Eva93].
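A standard worked example for (46) is the shrinking sphere. The computation below evaluates the equation on the zero level set only, for the radial ansatz $\phi(x,t) = |x| - R(t)$ in $\mathbb{R}^n$ with initial radius $R_0$ (notation introduced here for illustration), and uses the unnormalized curvature $\mathrm{div}(\nabla\phi/|\nabla\phi|)$.

```latex
\begin{align*}
\phi(x,t) &= |x| - R(t), \qquad \nabla\phi = \frac{x}{|x|}, \qquad |\nabla\phi| = 1, \qquad
\operatorname{div}\frac{\nabla\phi}{|\nabla\phi|} = \frac{n-1}{|x|}, \\
\phi_t = -R'(t) &= \frac{n-1}{R(t)} \quad \text{on } \{|x| = R(t)\}
\quad\Longrightarrow\quad
R(t) = \sqrt{R_0^2 - 2(n-1)\,t}, \qquad 0 \le t < \frac{R_0^2}{2(n-1)} .
\end{align*}
```

The sphere disappears at the finite time $R_0^2 / (2(n-1))$, a simple instance of the singularity formation that motivates weak and viscosity solutions.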

It  is  worth  remarking  that  Soner  and  Touzi  in  [ST03]  interpret  MCF  as  a  stochastic  control 
problem.  In  this  interpretation,  one  controls  a  Brownian  motion  for  which  one  is  allowed  to 
turn  off  diffusion  in  one  given  direction.  The  surface  in  this  case  arises  as  the  set  of  points 
that  can  be  reached  with  probability  1.  This  probabilistic  interpretation  is  quite  different  from 
those  mentioned  in  the  discussion  at  the  end  of  Section  7. 

Finally, given the affinity with the graph setting, it is worthwhile to comment briefly on
the more recent nonlocal mean curvature flow. Caffarelli and Souganidis [CS10] arrive at this
flow by following a nonlocal and continuum analogue of (MBO$_\tau$), where instead of using the
Laplacian one uses a fractional power of the Laplacian $(-\Delta)^s$ with $s \in (0, 1/2)$. A level set
formulation based on viscosity solutions was developed later by Imbert [Imb09].


A.3  The continuum MBO algorithm

The Merriman, Bence, and Osher (MBO) algorithm [MBO92, MBO93, MBO94], also known
as the threshold dynamics algorithm, approximates the dynamics of mean curvature flow (46)
by alternately applying diffusion and thresholding operators. Let $\chi(t, \cdot)$ be the characteristic
function of the set at time $t$. Define the diffusion operator $\chi_0 \mapsto u(t, \cdot) := e^{-t\Delta}\chi_0$ to be the
solution of the initial value problem for the heat equation with initial condition $u(0, \cdot) = \chi_0$.
Define the threshold operator

$$Pu(x) = \begin{cases} 1 & u(x) \geq \frac{1}{2}, \\ 0 & u(x) < \frac{1}{2}. \end{cases}$$

The MBO evolution of a set described by $\chi$ at time $T$ can then be succinctly written

$$\chi(T, \cdot) = (P e^{-\tau\Delta})^k \chi_0, \quad \text{where } \tau = T/k \qquad (47)$$

is the ‘time step’ and $k$ is a parameter. In [Eva93, BG95] convergence of the MBO algorithm
to motion by mean curvature, defined in (46), as $k \uparrow \infty$, is proven.

The MBO scheme and its implementation have evolved considerably since its original proposal.
We provide a small, non-exhaustive overview here. In [Mas92, Ruu98a], the MBO
scheme was extended to multiple-phase problems. In [Ruu96, Ruu98b], a spectral discretization
of the MBO scheme for motion by mean curvature was proposed, which is much more efficient
than finite difference approaches. This approach can be applied to both two-phase and multiphase
problems. In [ERT08], diffusion generated motion was applied to higher order geometric
motions. In [ET06], the MBO scheme was extended to a thresholding method for approximating
the evolution by gradient descent of the Mumford-Shah functional and applied to image
segmentation problems. In [ERT10] the authors study MBO-like schemes which use the signed
distance function. Recent work [EO13] presents new algorithms for multiphase mean curvature
flow, based on a variational description of the MBO scheme.

It is well-known that, in a finite difference scheme for the MBO algorithm, the time step $\tau$
(equivalently $k$) in (47) must be chosen carefully and in the limit as $k \uparrow \infty$, the discretized MBO
evolution is stationary. In fact, when discussing the numerical implementation of the algorithm
on a discrete grid, Merriman, Bence, and Osher [MBO92] observe:

“The basic requirement is that [the time step, $\tau$,] be short enough so that the local
analysis ... is valid, but also long enough so that the boundary curve moves by at
least one grid cell on the spatial grid (otherwise the curve would be stuck).”

They derive heuristic upper and lower bounds on the time step, $\tau$, for the algorithm to approximate
motion by mean curvature.
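
The scheme (47) and the pinning effect described above can be illustrated with a minimal sketch. It assumes a periodic square grid with spacing `h` and computes the diffusion step spectrally (in the spirit of the spectral discretizations mentioned above, but not taken from any of the cited works). The function names `mbo_step`, `mbo` and `disk_indicator` are hypothetical, and the Fourier symbol $|k|^2$ reflects the assumption that $\Delta$ is the positive semidefinite Laplacian, so that $e^{-\tau\Delta}$ is a smoothing operator.

```python
import numpy as np

def mbo_step(chi, tau, h=1.0):
    """One MBO step on a periodic grid: diffuse for time tau, then threshold at 1/2."""
    n0, n1 = chi.shape
    k0 = 2.0 * np.pi * np.fft.fftfreq(n0, d=h)
    k1 = 2.0 * np.pi * np.fft.fftfreq(n1, d=h)
    # Assumed sign convention: Delta is positive semidefinite, with symbol |k|^2.
    symbol = k0[:, None] ** 2 + k1[None, :] ** 2
    u = np.real(np.fft.ifft2(np.exp(-tau * symbol) * np.fft.fft2(chi)))
    return (u >= 0.5).astype(float)

def mbo(chi0, T, k, h=1.0):
    """Apply (P e^{-tau Delta})^k to chi0 with tau = T / k, as in (47)."""
    tau = T / k
    chi = chi0.copy()
    for _ in range(k):
        chi = mbo_step(chi, tau, h)
    return chi

def disk_indicator(n, radius):
    """Characteristic function of a centered disk on an n-by-n grid (illustrative data)."""
    x = np.arange(n) - n / 2
    X, Y = np.meshgrid(x, x, indexing="ij")
    return (X ** 2 + Y ** 2 <= radius ** 2).astype(float)
```

With a moderate time step, for example `mbo(disk_indicator(128, 20.0), T=100.0, k=10)`, the disk shrinks, as expected for motion by mean curvature. With a very small time step, such as `mbo_step(chi0, tau=1e-4)`, the diffused function stays so close to the indicator that thresholding returns the initial set, which is the discrete "stuck" behaviour described in the quotation.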

B  Coarea  formula  for  anisotropic  graph  total  variation 

In this appendix we prove a layer cake or coarea formula for the discrete total variation
$TV_a^q$. Such a formula is useful when trying to minimize the total variation over binary valued
functions. The formula shows that such minimization can be achieved by first minimizing over
real-valued functions and then selecting (almost any) sublevel set of the resulting minimizer.
This will give a binary minimizer of the total variation. Minimizing the total variation by itself
gives a trivial solution, so the usefulness of this layer cake formula hinges on the ability to find
a similar formula for the fidelity term or constraint that is applied to the minimization. An
example where this technique is used for the continuum isotropic total variation is [CEN06]. In
[CvGO11] it is used for a continuum anisotropic total variation.

Recall  from  Section  2  that 


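\[ TV_a^q(u) := \max\{\langle \operatorname{div}\varphi, u\rangle_{\mathcal{V}} : \varphi \in \mathcal{E},\ \|\varphi\|_{\mathcal{E},\infty} \leq 1\} = \langle \nabla u, \varphi^a\rangle_{\mathcal{E}} = \frac12 \sum_{i,j\in V} \omega_{ij}^q |u_i - u_j|, \]

where a particular choice of $\varphi^a$ is given in (1): $\varphi^a := \operatorname{sgn}(\nabla u)$. Since, for given $u \in \mathcal{V}$, and for
all $i, j \in V$, $(u_i - u_j)\varphi^a_{ij} \leq 0$, an equivalent definition of $TV_a^q$ is

\[ TV_a^q(u) = \max\{\langle \operatorname{div}\varphi, u\rangle_{\mathcal{V}} : \varphi \in \mathcal{E}^-(u)\}, \]

where

\[ \mathcal{E}^-(u) := \{\varphi \in \mathcal{E} : \|\varphi\|_{\mathcal{E},\infty} \leq 1,\ \forall i, j \in V\ (u_i - u_j)\varphi_{ij} \leq 0\}. \]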

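Lemma B.1. Let $u \in \mathcal{V}$ and define

\[ E(t) := \{i \in V : u_i > t\}. \]

Then

\[ TV_a^q(u) = \frac12 \sum_{i,j\in V} \omega_{ij}^q |u_i - u_j| = \frac12 \int_{\mathbb{R}} \sum_{i,j\in V} \omega_{ij}^q \left|(\chi_{E(t)})_i - (\chi_{E(t)})_j\right| dt. \]

Proof. Fix $u \in \mathcal{V}$. Note that

\[ u_i = \int_{\mathbb{R}} \chi_{[0,u_i]}(t)\, dt = \int_{\mathbb{R}} (\chi_{E(t)})_i\, dt, \tag{48} \]

from which it immediately follows that

\[ \sum_{i,j\in V} \omega_{ij}^q |u_i - u_j| = \sum_{i,j\in V} \omega_{ij}^q \left| \int_{\mathbb{R}} \left((\chi_{E(t)})_i - (\chi_{E(t)})_j\right) dt \right| \leq \int_{\mathbb{R}} \sum_{i,j\in V} \omega_{ij}^q \left|(\chi_{E(t)})_i - (\chi_{E(t)})_j\right| dt. \]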
To prove the opposite inequality, we follow closely the argument used to prove the analogous
result in the continuum case, in [EG92, 5.5 Theorem 1]. Let

\[ k(t) := \sum_{i\in V\setminus E(t)} \sum_{j\in V} \omega_{ij}^q |u_i - u_j| = \sum_{i\in V :\, u_i \leq t} \sum_{j\in V} \omega_{ij}^q |u_i - u_j|. \]

Since $k$ is a monotone nondecreasing function, the derivative

\[ k'(t) = \lim_{r\to 0} \frac1r \sum_{i\in V :\, t < u_i \leq t+r} \sum_{j\in V} \omega_{ij}^q |u_i - u_j| \tag{49} \]

exists almost everywhere. Hence

\[ \int_{\mathbb{R}} k'(t)\, dt \leq \sum_{i,j\in V} \omega_{ij}^q |u_i - u_j|. \tag{50} \]


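We claim that, for $\varphi \in \mathcal{E}^-(u)$,

\[ \int_{\mathbb{R}} \sum_{i\in E(t)} \sum_{j\in V} \omega_{ij}^q (\varphi_{ji} - \varphi_{ij})\, dt \leq \int_{\mathbb{R}} k'(t)\, dt. \tag{51} \]

If this claim is true, then by taking the maximum over all $\varphi \in \mathcal{E}^-(u)$, just as in the discussion
immediately preceding this lemma, it follows that

\[ \int_{\mathbb{R}} \sum_{i,j\in V} \omega_{ij}^q \left|(\chi_{E(t)})_i - (\chi_{E(t)})_j\right| dt \leq \int_{\mathbb{R}} k'(t)\, dt. \]

By (50) the result then follows.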

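We now prove the claim (51). First note that

\[ \sum_{i,j\in V} \omega_{ij}^q (\chi_{E(t)})_i (\varphi_{ji} - \varphi_{ij}) = \sum_{i\in E(t)} \sum_{j\in V} \omega_{ij}^q (\varphi_{ji} - \varphi_{ij}). \]

Fix $t \in \mathbb{R}$ and $r > 0$, and define the function $\eta \in C(\mathbb{R})$ as

\[ \eta(s) := \begin{cases} 0 & \text{if } s \leq t, \\ \frac{s-t}{r} & \text{if } t \leq s \leq t + r, \\ 1 & \text{if } s \geq t + r. \end{cases} \]

Clearly

\[ \eta'(s) = \begin{cases} 0 & \text{if } s < t \text{ or } s > t + r, \\ \frac1r & \text{if } t < s < t + r. \end{cases} \]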

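Let $\phi \in C^\infty(\mathbb{R})$ be a mollifier with $\operatorname{supp} \phi \subset [-1, 1]$ and $\int_{\mathbb{R}} \phi = 1$, and define for $\delta > 0$,
$\phi_\delta(x) := \frac1\delta \phi\left(\frac{x}{\delta}\right)$. Let $\eta_\delta := \phi_\delta * \eta$; then $\eta_\delta \in C^\infty(\mathbb{R})$, $\eta_\delta \to \eta$ a.e. as $\delta \to 0$, and $\eta_\delta \to \eta$
uniformly on compact subsets of $\mathbb{R}$ as $\delta \to 0$ (e.g., [Eva10, Appendix C, Theorem 7]). The
Taylor series with remainder gives us that for all $u_i, u_j \in \mathbb{R}$ there exists a $\xi_{ij} \in [u_i, u_j] \cup [u_j, u_i]$
such that

\[ \eta_\delta(u_j) = \eta_\delta(u_i) + \eta_\delta'(u_i)(u_j - u_i) + \frac12 \eta_\delta''(\xi_{ij})(u_j - u_i)^2. \]

Writing $\delta_t$ for the Dirac delta distribution centered at $t \in \mathbb{R}$, we compute

\[ \eta_\delta''(s) = \eta'' * \phi_\delta(s) = \frac1r (\delta_t - \delta_{t+r}) * \phi_\delta(s) = \frac1r \left(\phi_\delta(s - t) - \phi_\delta(s - t - r)\right). \]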
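We note that since $\operatorname{supp} \phi_\delta \subset [-\delta, \delta]$ we have $\operatorname{supp} \eta_\delta' \subset [t - \delta, t + r + \delta]$. Using the symmetry
of $\omega_{ij}$ and antisymmetry of $\varphi_{ij}$ we now find

\[
\begin{aligned}
\sum_{i,j\in V} \omega_{ij}^q\, \eta_\delta(u_i) (\varphi_{ji} - \varphi_{ij})
&= \frac12 \sum_{i,j\in V} \omega_{ij}^q \left(\eta_\delta(u_i) - \eta_\delta(u_j)\right) (\varphi_{ji} - \varphi_{ij}) \\
&= \sum_{i,j\in V} \omega_{ij}^q \left(\eta_\delta(u_i) - \eta_\delta(u_j)\right) \varphi_{ji} \\
&= -\sum_{i,j\in V} \omega_{ij}^q \left(\eta_\delta'(u_i)(u_j - u_i) + \frac12 \eta_\delta''(\xi_{ij})(u_j - u_i)^2\right) \varphi_{ji} \\
&= \sum_{i\in V :\, t-\delta < u_i < t+r+\delta}\ \sum_{j\in V} \eta_\delta'(u_i)\, \omega_{ij}^q (u_i - u_j) \varphi_{ji} \\
&\quad + \frac{1}{2r} \sum_{i,j\in V} \omega_{ij}^q \left(\phi_\delta(\xi_{ij} - t - r) - \phi_\delta(\xi_{ij} - t)\right) (u_i - u_j)^2 \varphi_{ji}.
\end{aligned}
\tag{52}
\]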


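Integrating over $\mathbb{R}$ and taking the limit $\delta \to 0$, for the first term in (52) we compute

\[
\begin{aligned}
\lim_{\delta\to 0} \int_{\mathbb{R}} \sum_{i\in V :\, t-\delta < u_i < t+r+\delta}\ \sum_{j\in V} \eta_\delta'(u_i)\, \omega_{ij}^q (u_i - u_j) \varphi_{ji}\, dt
&\leq \lim_{\delta\to 0} \int_{\mathbb{R}} \sum_{i\in V :\, t-\delta < u_i < t+r+\delta}\ \sum_{j\in V} \frac1r\, \omega_{ij}^q (u_i - u_j) \varphi_{ji}\, dt \\
&= \frac1r \lim_{\delta\to 0} \int_{\mathbb{R}} \sum_{i\in V :\, t-\delta < u_i < t+r+\delta}\ \sum_{j\in V} \omega_{ij}^q (u_i - u_j) \varphi_{ji}\, dt \\
&= \frac1r \int_{\mathbb{R}} \sum_{i\in V :\, t \leq u_i \leq t+r}\ \sum_{j\in V} \omega_{ij}^q (u_i - u_j) \varphi_{ji}\, dt,
\end{aligned}
\]

where we have used that, by definition of $\mathcal{E}^-(u)$, $(u_i - u_j)\varphi_{ji} \geq 0$, and the monotone convergence
theorem.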

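For the second term in (52) we find

\[
\begin{aligned}
&\frac{1}{2r} \lim_{\delta\to 0} \int_{\mathbb{R}} \sum_{i,j\in V} \omega_{ij}^q \left(\phi_\delta(\xi_{ij} - t - r) - \phi_\delta(\xi_{ij} - t)\right) (u_i - u_j)^2 \varphi_{ji}\, dt \\
&\quad = \frac{1}{2r} \lim_{\delta\to 0} \sum_{i,j\in V} \omega_{ij}^q \varphi_{ji} (u_i - u_j)^2 \int_{\mathbb{R}} \left(\phi_\delta(\xi_{ij} - t - r) - \phi_\delta(\xi_{ij} - t)\right) dt = 0,
\end{aligned}
\]

since $\int_{\mathbb{R}} \phi_\delta = 1$.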

Combining  the  above  we  have 


where  we  used  the  dominated  convergence  theorem  in  the  second  line. 


(53) 
Finally, we conclude, by the definition of $\eta$, (49), and (53),

\[ \int_{\mathbb{R}} \sum_{i\in E(t)} \sum_{j\in V} (\varphi_{ji} - \varphi_{ij})\, \omega_{ij}^q\, dt = \cdots \leq \cdots \leq \int_{\mathbb{R}} k'(t)\, dt, \]

which proves (51). The second and third identities follow from Lebesgue’s dominated convergence
theorem, using repeatedly the fact that $u$ is fixed and hence, by the finitude of the graph
$G$, $u$ has compact support and a finite range. The final inequality follows since $\varphi$ satisfies
$\|\varphi\|_{\mathcal{E},\infty} \leq 1$. □
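
On a finite graph the coarea formula of Lemma B.1 can be checked directly, because $t \mapsto TV_a^q(\chi_{E(t)})$ is piecewise constant between consecutive values of $u$. The following is a minimal sketch of such a check; it assumes a symmetric nonnegative weight matrix `W` with zero diagonal and the expression $TV_a^q(u) = \frac12 \sum_{i,j} \omega_{ij}^q |u_i - u_j|$. The function names are hypothetical.

```python
import numpy as np

def tv_aniso(u, W, q=1.0):
    """Anisotropic graph total variation: 1/2 * sum_{i,j} w_ij^q |u_i - u_j|."""
    Wq = np.where(W > 0, W ** q, 0.0)  # treat absent edges (w_ij = 0) as zero for every q
    return 0.5 * np.sum(Wq * np.abs(u[:, None] - u[None, :]))

def coarea_integral(u, W, q=1.0):
    """Exact integral over t of TV_a^q(chi_{E(t)}), with E(t) = {i : u_i > t}."""
    levels = np.unique(u)  # sorted distinct values of u
    total = 0.0
    for lo, hi in zip(levels[:-1], levels[1:]):
        # For every t in [lo, hi) the set E(t) equals {i : u_i > lo}.
        chi = (u > lo).astype(float)
        total += (hi - lo) * tv_aniso(chi, W, q)
    # Outside [min u, max u) the set E(t) is all of V or empty, so the integrand vanishes.
    return total
```

For any real-valued `u`, `tv_aniso(u, W, q)` and `coarea_integral(u, W, q)` agree up to rounding, which is the statement of the lemma in this finite setting.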


C  Calculation  of  the  first  variation  for  graph  total  variation 

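In (28) we computed $\frac{d}{dt}\big|_{t=0} TV_a^q(u + tv) = \langle \operatorname{sgn}(\nabla u), \nabla v\rangle_{\mathcal{E}}$ using the convexity of $TV_a^q$. In this
section we extend this fact to other kinds of graph total variation, which are expressible as

\[ \mathcal{TV}(u) := \max\{\langle \operatorname{div}\varphi, u\rangle_{\mathcal{V}} : \varphi \in \mathcal{A}\}, \tag{54} \]

where $\mathcal{A} \subset \mathcal{E}$ is some admissible set of edge functions, such that a maximizer $\varphi^u \in \mathcal{A}$ exists
(even if it might not be unique). The key fact is that such a $\mathcal{TV}(u)$ is convex and might be
studied via convex analysis. The convexity of $\mathcal{TV}$ is evident from its definition: $u \mapsto \mathcal{TV}(u)$ is
a scalar valued function given as the maximum of a family of linear functions $u \mapsto \langle \operatorname{div}\varphi, u\rangle_{\mathcal{V}}$.
Let us recall some concepts from convex analysis [ET76, Chapter 1, Section 5], in particular,
the subdifferential of $\mathcal{TV}$ at $u$. This set valued function is denoted by $\partial\mathcal{TV}(u)$ and given by

\[ \partial\mathcal{TV}(u) := \{v \in \mathcal{V} \mid \mathcal{TV}(u) < \infty \text{ and } \forall w \in \mathcal{V}\ \ \mathcal{TV}(w) \geq \mathcal{TV}(u) + \langle v, w - u\rangle_{\mathcal{V}}\}. \]

That is, $v \in \partial\mathcal{TV}(u)$ if and only if it is the slope of an affine function which is tangent to the
graph of $\mathcal{TV}$ at $u$. In particular, at the points where $\mathcal{TV}$ is differentiable $\partial\mathcal{TV}(u)$ consists of
a single element: the gradient of $\mathcal{TV}$ at $u$.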

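In the particular case of $\mathcal{TV}$, it follows that

\[ \partial\mathcal{TV}(u) = \{v \in \mathcal{V} \mid \langle v, u\rangle_{\mathcal{V}} = \mathcal{TV}(u)\}. \tag{55} \]

Indeed, note that by (54), for any $w \in \mathcal{V}$, $\mathcal{TV}(w) = \langle \operatorname{div}\varphi^w, w\rangle_{\mathcal{V}}$, where $\varphi^w \in \mathcal{A}$ is a maximizer
in (54). It follows that, if $u \in \mathcal{V}$ is given, then

\[ v \in \partial\mathcal{TV}(u) \iff \forall w \in \mathcal{V}\ \ \langle \operatorname{div}\varphi^w, w\rangle_{\mathcal{V}} \geq \langle \operatorname{div}\varphi^u, u\rangle_{\mathcal{V}} + \langle v, w - u\rangle_{\mathcal{V}}. \]

By choosing $w = 0$ and $w = 2u$, respectively, we find $\langle v, u\rangle_{\mathcal{V}} = \langle \operatorname{div}\varphi^u, u\rangle_{\mathcal{V}} = \mathcal{TV}(u)$. This
proves the set identity (55) for any $u \in \mathcal{V}$.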




On the other hand, the convexity of $\mathcal{TV}(u)$ implies it is a locally Lipschitz function, making
it differentiable for a.e. $u \in \mathcal{V}$ by Rademacher’s theorem [EG92, Chapter 6, Section 6.2,
Theorem 2]. Therefore, for a.e. $u \in \mathcal{V}$, $\partial\mathcal{TV}(u)$ contains a single element $v = \operatorname{div}\varphi^u$. Then, for
a.e. $u$, it follows that

\[ \frac{d}{dt} \mathcal{TV}(u + tv)\Big|_{t=0} = \langle \operatorname{div}\varphi^u, v\rangle_{\mathcal{V}}. \]
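
For the anisotropic case the derivative formula can be compared with a finite difference quotient. The sketch below assumes the explicit form $TV_a^q(u) = \frac12 \sum_{i,j} \omega_{ij}^q |u_i - u_j|$, so that, wherever all values $u_i$ are distinct, the directional derivative in direction $v$ is $\frac12 \sum_{i,j} \omega_{ij}^q \operatorname{sgn}(u_i - u_j)(v_i - v_j)$. The function names and the step size `h` are illustrative choices.

```python
import numpy as np

def tv_aniso(u, W, q=1.0):
    """Anisotropic graph total variation: 1/2 * sum_{i,j} w_ij^q |u_i - u_j|."""
    Wq = np.where(W > 0, W ** q, 0.0)
    return 0.5 * np.sum(Wq * np.abs(u[:, None] - u[None, :]))

def first_variation_aniso(u, v, W, q=1.0):
    """Candidate derivative: 1/2 * sum_{i,j} w_ij^q sgn(u_i - u_j) (v_i - v_j)."""
    Wq = np.where(W > 0, W ** q, 0.0)
    du = u[:, None] - u[None, :]
    dv = v[:, None] - v[None, :]
    return 0.5 * np.sum(Wq * np.sign(du) * dv)

def directional_derivative(f, u, v, h=1e-6):
    """Central difference approximation of d/dt f(u + t v) at t = 0."""
    return (f(u + h * v) - f(u - h * v)) / (2.0 * h)
```

At a generic `u` (no ties among the entries) the two quantities agree up to rounding. At points with ties, such as `u = 0`, the functional is not differentiable, which is consistent with the "a.e." qualifier in the statement above: there the one-sided derivative in direction `v` equals `tv_aniso(v, W, q)`, while the sign formula returns zero.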

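Now we can consider particular choices for $\mathcal{TV}$ and hence for $\mathcal{A}$. For $\mathcal{TV} = TV_a^q$, we have
$\varphi^u = \varphi^a$ from (1), as already explained in Remark 3.11. Similar computations can be done if
we take $\mathcal{A}$’s corresponding to $\varphi^u$ given respectively by

\[
\varphi^u(u) = \varphi^{\mathcal{E}}(u) := \begin{cases} \frac{\nabla u}{\|\nabla u\|_{\mathcal{E}}} & \text{if } \|\nabla u\|_{\mathcal{E}} \neq 0, \\ 0 & \text{if } \|\nabla u\|_{\mathcal{E}} = 0, \end{cases}
\quad \text{and} \quad
\varphi^u_{ij}(u) = \varphi^{TV}_{ij}(u) := \begin{cases} \frac{(\nabla u)_{ij}}{|\nabla u|_i} & \text{if } |\nabla u|_i \neq 0, \\ 0 & \text{if } |\nabla u|_i = 0, \end{cases}
\]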

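i.e., optimal $\varphi$’s for $\mathcal{TV}(u) = \|\nabla u\|_{\mathcal{E}}$ and $\mathcal{TV}(u) = TV(u)$, respectively (see Section 2). The
previous analysis shows the first variations in these cases are given by the $\mathcal{V}$-inner product with

\[
\operatorname{div}\varphi^{\mathcal{E}} = \begin{cases} \frac{\Delta u}{\|\nabla u\|_{\mathcal{E}}} & \text{if } \|\nabla u\|_{\mathcal{E}} \neq 0, \\ 0 & \text{if } \|\nabla u\|_{\mathcal{E}} = 0, \end{cases}
\qquad
(\operatorname{div}\varphi^{TV})_i = \frac12 d_i^{-r} \sum_{j\in V} \omega_{ij}^q \left(\varphi^{TV}_{ji} - \varphi^{TV}_{ij}\right).
\]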

For the latter we can also write

\[
(\operatorname{div}\varphi^{TV})_i = \frac12 d_i^{-r} \sum_{j\in V} \omega_{ij} \left(\frac{u_i - u_j}{|\nabla u|_j} + \frac{u_i - u_j}{|\nabla u|_i}\right) = \frac12 d_i^{-r} \sum_{j\in V} \omega_{ij} \left(\frac{1}{|\nabla u|_j} + \frac{1}{|\nabla u|_i}\right)(u_i - u_j),
\]

where we have to remember that $\frac{u_i - u_j}{|\nabla u|_j}$ and $\frac{1}{|\nabla u|_j}$ are to be interpreted as 0 whenever $|\nabla u|_j = 0$
for any $j \in V$ (including $j = i$). Because the node function $\operatorname{div}\varphi^{TV}$ is the first variation of
the isotropic graph total variation, in the literature it is sometimes referred to as curvature or
1-Laplacian. In this paper we have argued why the use of the anisotropic total variation $TV_a^q$
to define curvature, as in (14), is a more natural choice on graphs.
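
The isotropic first variation can be checked numerically in the same way. The sketch below rests on several assumptions about the conventions of Section 2 that are not restated in this appendix: $|\nabla u|_i = \big(\frac12 \sum_j \omega_{ij}(u_i - u_j)^2\big)^{1/2}$, $TV(u) = \sum_i |\nabla u|_i$, degrees $d_i = \sum_j \omega_{ij}$, and $\langle a, b\rangle_{\mathcal{V}} = \sum_i a_i b_i d_i^r$. Under these assumptions the candidate node function is $\frac12 d_i^{-r} \sum_j \omega_{ij}\big(\frac{1}{|\nabla u|_j} + \frac{1}{|\nabla u|_i}\big)(u_i - u_j)$. All function names are hypothetical.

```python
import numpy as np

def grad_norms(u, W):
    """Assumed convention: |grad u|_i = sqrt(1/2 * sum_j w_ij (u_i - u_j)^2)."""
    du = u[:, None] - u[None, :]
    return np.sqrt(0.5 * np.sum(W * du ** 2, axis=1))

def tv_iso(u, W):
    """Isotropic graph total variation, assumed to equal sum_i |grad u|_i."""
    return np.sum(grad_norms(u, W))

def div_phi_tv(u, W, r=0.0):
    """Candidate first variation of tv_iso with respect to the V-inner product."""
    du = u[:, None] - u[None, :]
    g = grad_norms(u, W)
    inv = np.zeros_like(g)
    np.divide(1.0, g, out=inv, where=g > 0)  # 1/|grad u|_j is read as 0 where the norm vanishes
    d = W.sum(axis=1)  # node degrees, assumed positive (no isolated nodes)
    return 0.5 * d ** (-r) * np.sum(W * (inv[None, :] + inv[:, None]) * du, axis=1)

def inner_V(a, b, W, r=0.0):
    """Assumed V-inner product: sum_i a_i b_i d_i^r."""
    d = W.sum(axis=1)
    return np.sum(a * b * d ** r)
```

For a generic `u` on a connected graph, a central difference quotient of `tv_iso` in a direction `v` agrees with `inner_V(div_phi_tv(u, W, r), v, W, r)`. The exponent `r` cancels between the node function and the inner product, so this check does not distinguish between choices of `r`.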